3.2 Regularization Methods for Categorical Predictors (Gerhard Tutz)

Content provided by Universite Paris 1 Pantheon-Sorbonne.

StatLearn 2010 - Workshop on "Challenging problems in Statistical Learning"



Most regularization methods in regression analysis have been designed for metric predictors and cannot be used for categorical predictors. A rare exception is the group lasso, which allows for categorical predictors or factors. We will consider alternative approaches based on penalized likelihood and boosting techniques. Typically the operating model will be a generalized linear model. We will start with ordered categorical predictors, which unfortunately are often treated as metric variables because software for metric predictors is available. It is shown how difference penalties on adjacent dummy coefficients can be used to obtain smooth effect curves that can be estimated even in cases where simple maximum likelihood methods fail. The difference penalty turns out to be highly competitive when compared to methods often seen in practice, namely simple linear regression on the group labels and pure dummy coding. In a second step, L1-penalty-based methods that enforce variable selection and clustering of categories are presented and investigated. A distinction is made between ordered predictors, where clustering refers to the fusion of adjacent categories, and nominal predictors, for which arbitrary categories can be fused. The methods make it possible to identify which categories actually differ with respect to the dependent variable. Finally, interaction effects are modeled within the framework of varying-coefficient models. For the proposed methods, properties of the estimators are investigated. The methods are illustrated and compared in simulation studies and applied to real-world data.
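
The difference penalty on adjacent dummy coefficients can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes a Gaussian linear model (the simplest generalized linear model), a squared first-order difference penalty, and category 0 as the reference level with its coefficient fixed at zero. The function names `difference_matrix` and `fit_ordinal_smooth` and the tuning parameter `lam` are hypothetical and chosen only for this example.

```python
import numpy as np


def difference_matrix(k):
    """First-order difference matrix D of shape (k-1, k).

    Row j of D maps a coefficient vector b to b[j+1] - b[j].
    """
    return np.diff(np.eye(k), axis=0)


def fit_ordinal_smooth(levels, y, n_levels, lam):
    """Penalized least squares for one ordered categorical predictor.

    levels   : integer category codes in 0..n_levels-1
    y        : response values
    n_levels : number of ordered categories K
    lam      : penalty weight (assumed tuning parameter, lam >= 0)

    Returns [intercept, beta_1, ..., beta_{K-1}], with beta_0 fixed at 0.
    """
    levels = np.asarray(levels)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Dummy coding: one indicator column per category, reference column dropped.
    Z = np.zeros((n, n_levels))
    Z[np.arange(n), levels] = 1.0
    X = np.column_stack([np.ones(n), Z[:, 1:]])

    # Differences of adjacent coefficients (0, beta_1, ..., beta_{K-1});
    # dropping the first column encodes the fixed reference coefficient.
    D = difference_matrix(n_levels)[:, 1:]

    # Penalty matrix; the intercept (index 0) is left unpenalized.
    P = np.zeros((n_levels, n_levels))
    P[1:, 1:] = D.T @ D

    # Solve the penalized normal equations (X'X + lam * P) coef = X'y.
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)


# Category 3 has no observations, so plain dummy-coded least squares
# is not identified, but the penalized fit still exists.
levels = np.array([0, 0, 1, 1, 2, 2, 4, 4])
y = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 5.0, 5.0])
coef = fit_ordinal_smooth(levels, y, n_levels=5, lam=1e-6)
print(np.round(coef, 3))
```

In this sketch the coefficient of the empty category is placed between its neighbours by the penalty, which reflects the point made above that smooth effect curves can be estimated even where unpenalized estimation fails. A very large `lam` shrinks all adjacent differences toward zero, so all categories collapse onto the reference level.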
