I am wondering about applications of the dual variables that arise in the Alternating Direction Method of Multipliers (ADMM). In the consensus form of ADMM they grow until they have forced consensus among variables that are treated separately for ease of computation. Once convergence occurs they are usually thrown away as just a means to an end.
But it seems to me that they have usefulness in their own right. For example, they tell you how much pressure is needed to force the data into your model. If a dual variable is zero, the constraint it enforces is inactive, which suggests the associated feature is not needed and can be pruned. In other words, the duals tell you how to improve or discover your model even when the model itself is not meaningful.
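To make the "pressure" interpretation concrete, here is a toy consensus-ADMM sketch of my own construction (the data and penalty parameter are made up). Each term pulls the consensus value toward its own target, and at convergence each scaled dual equals the negative gradient of its local objective at the consensus point, i.e. exactly how hard that term is pulling:

```python
import numpy as np

# Toy consensus problem: minimize sum_i (x_i - a_i)^2  subject to  x_i = z.
# Each local term pulls z toward its own a_i; the scaled dual u_i records
# how hard.  (Illustrative sketch only, with hypothetical data.)
a = np.array([0.0, 1.0, 5.0])    # hypothetical local targets
rho = 1.0                         # ADMM penalty parameter
x = np.zeros_like(a)
u = np.zeros_like(a)              # scaled duals, y_i = rho * u_i
z = 0.0

for _ in range(200):
    # x-update: argmin_x (x - a_i)^2 + (rho/2)(x - z + u_i)^2
    x = (2 * a + rho * (z - u)) / (2 + rho)
    # z-update: averaging enforces consensus
    z = np.mean(x + u)
    # dual ascent
    u = u + x - z

# At convergence z = mean(a), and each dual equals the negative local
# gradient at consensus: rho * u_i = 2 * (a_i - z).
print(z, rho * u)
```

Note the duals are not all zero even at a perfectly converged consensus: the term with target 5.0 carries a large dual because the data there disagrees most with the consensus, which is the kind of signal I'm asking about.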
I'm wondering if anyone knows any examples of this being put to use.
Let me give an example. Say you are using ADMM to solve a continuous approximation to a discrete model. For example, you are using a relaxation or the convex-concave procedure to solve a non-convex problem involving Booleans. At the end of the day, the dual variables enforcing the Boolean constraints tell you how hard the data wants to pull each value away from its Boolean assignment.
This could be very useful if one wanted to fine-tune the answer with a discrete neighbor search. In a large parameter space you have many choices of which Booleans to try flipping, but the dual variables' magnitudes tell you which ones to try first. It seems this would be extremely useful in any kind of discrete search.
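Here is a rough sketch of what I mean, assuming the common ADMM heuristic for Boolean problems (split off a copy of the variable and project it onto {0, 1} in the z-update; everything here, including the random problem, is hypothetical). After the loop, the scaled duals rank which coordinates to try flipping first:

```python
import numpy as np

# Hypothetical Boolean least squares: minimize ||A b - c||^2, b in {0,1}^n,
# attacked with the nonconvex-ADMM heuristic (z-update projects onto {0,1}).
rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(8, n))
b_true = rng.integers(0, 2, size=n).astype(float)
c = A @ b_true

rho = 1.0
x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
M = np.linalg.inv(A.T @ A + rho * np.eye(n))   # for the x-update
Atc = A.T @ c

for _ in range(100):
    x = M @ (Atc + rho * (z - u))               # unconstrained LS step
    z = np.round(np.clip(x + u, 0.0, 1.0))      # project onto {0,1}
    u = u + x - z                               # dual (residual) update

# |u_i| measures how hard the data pulls coordinate i away from its
# Boolean value; candidate flips for a neighbor search, hardest pull first:
order = np.argsort(-np.abs(u))
print(z, order)
```

This is only a heuristic (the projection step makes the problem nonconvex, so convergence isn't guaranteed), but the ordering by dual magnitude is exactly the "which Boolean to flip first" signal described above.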
Here is another example, this time for anomaly detection. In some models you can apply ADMM so that effectively each data point has its own set of parameters. Then you do consensus ADMM to force them to be equal across all data points. Again, this is usually done just to ease the fitting procedure. But at the end, the dual variables tell you both which features are under pressure to move in multiple directions and which data points are outliers.
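A minimal sketch of that idea, with made-up data: give every point its own copy of a scalar parameter, force consensus, and then look at which point's dual had to pull hardest to keep it in line. The outlier's dual dominates:

```python
import numpy as np

# Each point i gets its own parameter t_i (a scalar "mean" here), and
# consensus ADMM forces t_i = tbar.  The dual attached to the outlier
# must pull hardest against consensus.  (Hypothetical data.)
data = np.array([1.0, 1.1, 0.9, 1.05, 8.0])   # last point is an outlier
rho = 1.0
t = np.zeros_like(data)
u = np.zeros_like(data)                        # scaled duals
tbar = 0.0

for _ in range(300):
    t = (2 * data + rho * (tbar - u)) / (2 + rho)  # per-point fit
    tbar = np.mean(t + u)                          # consensus step
    u = u + t - tbar                               # dual update

duals = rho * u                  # equals 2*(data_i - tbar) at convergence
outlier = int(np.argmax(np.abs(duals)))
print(tbar, duals, outlier)
```

Of course a robust fit would do better than the plain mean here; the point is only that the duals, normally discarded, already contain the per-point anomaly score for free.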
Does anyone know of any anomaly detection methods based on this concept?