The authors present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks.The authors prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which, as they prove, converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across tasks sparse representations for these functions.The authors also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. The authors report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks.Their algorithm can also be used, as a special case, to simply select, not learn, a few common variables across the tasks.