Recommendation for an appropriate model

  • 8 Aug 2025 11:18 AM
    Reply # 13529577 on 13524649

    Thanks Jimmy. Could you suggest an R package that lets you specify parametric terms for some predictors and non-parametric terms for others?

  • 1 Aug 2025 4:49 PM
    Reply # 13527107 on 13524649

    A hybrid modeling approach could be worth exploring. For example, combining a tree-based method such as gradient boosting, to capture local non-linearities, with a parametric model that specifically accounts for sample size effects might help with extrapolation, especially in the region where computation time explodes.
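
    A minimal sketch of the two-stage idea in base R, under stated assumptions: the data are simulated stand-ins (one parameter theta, one sample size n, log-time linear in n with a single hot spot), and loess is swapped in for gradient boosting purely to keep the example dependency-free. The names d, fit_par, fit_np and predict_time are hypothetical.

```r
# Simulated stand-in data (hypothetical, not the poster's runs).
set.seed(1)
m <- 2000
d <- data.frame(theta = runif(m), n = sample(10:60, m, replace = TRUE))

# Assumed truth: log-time is linear in n, with a hot spot near theta = 0.7.
hot <- 1.5 * exp(-((d$theta - 0.7) / 0.05)^2)
d$time <- exp(-2 + 0.12 * d$n + hot + rnorm(m, sd = 0.3))  # seconds

# Stage 1: parametric part. Exponential growth in n is linear on the log scale,
# so this term can extrapolate to larger sample sizes.
fit_par <- lm(log(time) ~ n, data = d)

# Stage 2: flexible part, fitted to the stage 1 residuals using theta only.
d$res <- residuals(fit_par)
fit_np <- loess(res ~ theta, data = d, span = 0.1)

# Combined prediction, back-transformed to seconds.
predict_time <- function(newdata) {
  exp(predict(fit_par, newdata) + predict(fit_np, newdata))
}

# Second row has n well beyond the observed range of 10 to 60.
new <- data.frame(theta = c(0.2, 0.7), n = c(40, 90))
pred <- predict_time(new)
slow <- pred > 30 * 60  # flag runs predicted to exceed 30 minutes
print(data.frame(new, pred_seconds = pred, slow = slow))
```

    The flexible stage only ever sees theta, so extrapolation in n is left entirely to the parametric stage. Two caveats: by default loess returns NA for theta values outside the observed range, and exponentiating a log-scale prediction gives something closer to a median than a mean time.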

  • 25 Jul 2025 2:13 PM
    Message # 13524649

    I want to predict computation time from a set of parameters theta and a vector of sample sizes n. The computation is an integer linear program, and I have about 200,000 runs under varying conditions. The algorithm involves some randomisation, so I have included some duplicate runs to estimate pure error.

    The problem is that time = f(theta,n) has some unusual features. The relationship involves some hot and cold spots where computation time is higher or lower. If theta were the only input, then nearest neighbours or a random forest would possibly model it well. The dependence on the sample sizes, however, has some special features: (a) it can be erratic: a small increase in sample size can lead to a much longer computation time, which then disappears at the next larger sample size; (b) since the problem is NP-hard, computation time does explode at a certain point.

    I have limited data where the computation time is exploding, for obvious reasons. My aim is a prediction model that can tell me (or the user) when computation time is likely to exceed, say, 30 minutes (and I have very few instances of this).

    I am trying to work out what kind of model would pick up the anomalous patterns but also extrapolate to larger sample sizes where times will explode. My intuition is that flexible non-parametric methods do not extrapolate well.

    Does anyone have a suggestion for what kind of model I might use (within RStudio)?

