Modern regression problems are increasingly complex and often comprise a large number of features. This increases the chances that a substantial portion of them are redundant and/or contain outlying values that can hinder classical estimation methods. Focusing on linear models, we contribute to this area by developing a general framework for simultaneous feature selection and outlier detection based on mixed-integer programming techniques. Through the estimation of binary unit-specific weights, the framework is robust against outliers in the response and/or the design matrix, and it provides optimality guarantees from both optimization and statistical-theory standpoints. We also develop computationally lean heuristics, as well as extensions of the framework to clusterwise regression, where non-outlying cases belong to a Gaussian mixture model, and to logistic regression. Finally, we consider the estimation of continuous unit-specific weights which, unlike commonly employed robust M-estimators, allow one to assign full weight to non-outlying observations, exclude the most aberrant observations from the fit, and down-weight milder outliers. The superior performance of our proposals is shown through simulations and real-world applications related to the "Omics" sciences and entomology.
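To fix ideas, the simultaneous feature selection and outlier detection problem mentioned above can be sketched as a mixed-integer program. The notation here is an assumption for illustration, not the paper's exact formulation: $y \in \mathbb{R}^n$ is the response, $x_i \in \mathbb{R}^p$ the $i$-th row of the design matrix, $z_j \in \{0,1\}$ indicates whether feature $j$ is selected, $\phi_i \in \{0,1\}$ is the binary weight of unit $i$ (zero for flagged outliers), $k_f$ and $k_o$ are hypothetical budgets on selected features and trimmed units, and $M$ is a big-M bound on the coefficients:

```latex
\begin{aligned}
\min_{\beta,\, z,\, \phi} \quad & \sum_{i=1}^{n} \phi_i \,\bigl(y_i - x_i^{\top}\beta\bigr)^2 \\
\text{s.t.} \quad & \sum_{j=1}^{p} z_j \le k_f,
  && \sum_{i=1}^{n} (1 - \phi_i) \le k_o, \\
& |\beta_j| \le M z_j, \quad j = 1,\dots,p,
  && z_j,\ \phi_i \in \{0,1\}.
\end{aligned}
```

Under this sketch, setting $z_j = 0$ forces $\beta_j = 0$ (feature excluded), while setting $\phi_i = 0$ removes unit $i$'s residual from the objective (unit flagged as an outlier), so both decisions are optimized jointly within one program.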