Recommendation for an appropriate model

  • 21 Aug 2025 5:54 PM
    Reply # 13533918 on 13524649

    Thanks for your witty reply, Duncan. I have got some insights from my ILP co-author. He was amused that I was trying to predict computation times; apparently nobody does it, because it would be as hard as solving the problem in polynomial time.

    It turns out that there is a practical solution though, which is what I was after. You can stop the algorithm (I am using Gurobi) after time T and get a bound on how far away you are from an upper bound on the objective, which will work well enough for my purposes.
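    A minimal sketch of that stop-at-T idea, assuming a branch-and-bound solver that reports, at interruption, the best incumbent objective and the best proven bound (in gurobipy these are Model.ObjVal and Model.ObjBound after setting Model.Params.TimeLimit; the numbers below are made up for illustration):

```python
def relative_gap(incumbent: float, best_bound: float) -> float:
    """Relative gap |best_bound - incumbent| / |incumbent|.

    For a maximisation problem the solver's best_bound is an upper
    bound on the optimum, so the true optimum is guaranteed to lie
    between the incumbent and best_bound.
    """
    if incumbent == 0.0:
        return float("inf")  # gap is undefined at a zero incumbent
    return abs(best_bound - incumbent) / abs(incumbent)

# Solver stopped at time T with incumbent 95 and proven upper bound 100:
gap = relative_gap(incumbent=95.0, best_bound=100.0)
# the incumbent is guaranteed to be within about 5.3% of the optimum.
```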

  • 18 Aug 2025 5:29 PM
    Reply # 13532824 on 13532756
    PS Sorry for rambling, but it even crossed my mind that generating enough training data for such a problem makes the problem infeasible. I am just thinking generally, without knowing anything about the problem. Maybe some toy problem could work simply and conceptually, like leaving a huge item in or out of a knapsack. What is the objective, for example?
    Last modified: 18 Aug 2025 5:57 PM | Duncan Lowes
  • 18 Aug 2025 10:17 AM
    Reply # 13532756 on 13529577

    Lively stats and analytics forum, Chris.

    Spent a while looking into this problem for you, but any and all relevant knowledge I had is either very old, very rusty, or non-existent.

    I am very rusty on the types of ILP and their levels of hardness and complexity, and it crossed my mind that you may not be able to predict it well at all; but if you could, could you not combine a logistic component with your other model? It gave me a chance to simulate some data and try playing with different models. Haven't done that for a while.

    Are you able to disclose the nature of the ILP and how many parameters, etc.?

    But my gut feel was that it sounds conceptually a bit like trying to predict whether a number is prime: you know they are out there, and roughly how often they come along, but... you made it sound quite hard to start with. Isn't it the nature of integer LPs that they go from a perfect fit to something very complex, without your being able to know when? I feel a bit embarrassed, because I am sure back in the day I would have had a better understanding, and the cognitive ability and practice, to think about it more.

    But thanks, as always, for posting something engaging to read and think about, and I am no longer ashamed to admit I spent time talking to ChatGPT and Grok on the matter.
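    The logistic-component idea above can be sketched as a two-stage model: stage 1 classifies runs as exploding or not, stage 2 regresses time on the non-exploding runs only. Everything here (the data, the n = 80 blow-up point, the coefficients) is synthetic and invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n = rng.uniform(10, 100, size=500)                  # sample sizes
explode = (n > 80) & (rng.random(500) < 0.7)        # blow-up past n = 80
y = np.where(explode, np.exp(0.1 * n),              # exploding times
             5 + 0.2 * n + rng.normal(0, 1, 500))   # normal-regime times

z = (n - n.mean()) / n.std()                        # standardise for GD
X = np.column_stack([np.ones_like(z), z])

# Stage 1: logistic regression for P(explode | n), plain gradient descent.
w = np.zeros(2)
t = explode.astype(float)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - t) / len(t)

# Stage 2: linear regression of time on n, fitted to non-exploding runs only.
beta, *_ = np.linalg.lstsq(X[~explode], y[~explode], rcond=None)

def predict(n_new):
    x = np.array([1.0, (n_new - n.mean()) / n.std()])
    p_explode = 1.0 / (1.0 + np.exp(-x @ w))
    return p_explode, x @ beta   # (blow-up risk, normal-regime time)
```

    The classifier flags likely blow-ups so the regression's prediction is only trusted when the blow-up risk is low.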

    Last modified: 18 Aug 2025 10:39 AM | Duncan Lowes
  • 8 Aug 2025 11:18 AM
    Reply # 13529577 on 13524649

    Thanks Jimmy. Could you suggest an R package where you can specify parametric terms for some predictors and non-parametric terms for others?

  • 1 Aug 2025 4:49 PM
    Reply # 13527107 on 13524649

    Using a hybrid modeling approach could be worth exploring. For example, combining a tree-based method like gradient boosting for capturing local non-linearities with a parametric model that specifically accounts for sample size effects might help with extrapolation, especially when the computation time explodes.
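    A minimal numpy-only sketch of such a hybrid, with a hypothetical exponential trend in n as the parametric part (trusted for extrapolation) and a k-nearest-neighbour residual correction in theta standing in for the tree-based part; the data-generating process here is invented, not taken from the poster's ILP runs:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data: log computation time grows linearly in sample
# size n (i.e. time grows exponentially), with a narrow "hot spot" in a
# parameter theta that a purely parametric model would miss.
n = rng.uniform(10, 60, size=400)
theta = rng.uniform(0, 1, size=400)
hot = np.exp(-((theta - 0.3) ** 2) / 0.002)
log_time = 0.08 * n + 2.0 * hot + rng.normal(0, 0.1, 400)

# Parametric part: linear fit of log-time on n; this exponential trend is
# the component we trust when extrapolating to larger n.
b, a = np.polyfit(n, log_time, 1)
resid = log_time - (a + b * n)

# Non-parametric part: k-nearest-neighbour average of the residuals in
# theta, capturing the local hot/cold spots.
def knn_resid(theta_new, k=15):
    idx = np.argsort(np.abs(theta - theta_new))[:k]
    return resid[idx].mean()

def predict_log_time(n_new, theta_new):
    return a + b * n_new + knn_resid(theta_new)
```

    The same decomposition works with gradient boosting in place of the k-NN step; the point is that only the parametric trend is extrapolated.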

  • 25 Jul 2025 2:13 PM
    Message # 13524649

    I want to predict computation time from a set of parameters theta and a sample size vector n. The computation is actually an integer linear program, and I have about 200,000 runs under varying conditions. The algorithm involves some randomisation, so I have included some duplication to estimate pure error.

    The problem is that time = f(theta, n) has some unusual features. The relationship involves some hot and cold spots where computation is higher or lower. If these were the only parameters, then nearest neighbours or a random forest would possibly model it well. The dependence on the sample sizes, however, has some special features: (a) it can be erratic; a small increase in sample size can lead to a much longer computation time, which then disappears for the next larger sample size; (b) since the problem is NP-hard, computation time does explode at a certain point.

    I have limited data where the computation time is exploding – for obvious reasons. My aim would be to have a prediction model that can tell me (or the user) when computation time is likely to be, say, greater than 30 mins (and I have very few instances of this).

    I was trying to think what kind of model would pick up the anomalous patterns but also extrapolate to larger sample sizes where times will explode. My intuition is that flexible non-parametric methods do not extrapolate well.

    Would anyone have a suggestion for what kind of model I might use (within RStudio)?

