Another puzzle: produce or learn?

When our kids were small, they played on sports teams (basketball, baseball, soccer, …). Their teams would focus on drills early in the season and on tournaments late in the season. In violin, one studies technique (scales, etudes, theory, etc.) as well as musicality (interpretation, performance, etc.). In (engineering) research, we spend a lot of time learning the fundamentals (coursework, mathematical tools, analysis/systems/experimental skills, etc.) as well as solving problems in specific applications (research). What is the optimal allocation of one’s effort between these two kinds of activities?

This is a complex and domain-dependent problem.  I suppose there is a lot of serious empirical and modeling research done in social sciences (I’d appreciate pointers if you know any).  But let’s formulate a ridiculously simple model to make a fun puzzle.

1. Consider a finite horizon t = 1, 2, …, T.   The time period t can be a day or a year.  The horizon T can be a project duration or a career.
2. Suppose there are only two kinds of activities; call them production and learning. Our task is to decide, for each t, how much effort to devote to producing and to learning. Call these amounts p(t) and l(t) respectively.
3. These activities build two kinds of capabilities.  The fundamental capability L(t) at time t depends on the amount of learning we have done up to time t-1, L(t) := L(l(s), s=1, …, t-1).  The production capability P(t) at time t depends on the amount of effort we have devoted to production up to time t-1, P(t) := P(p(s), s=1, …, t-1).   We assume the functions L(l(s), s=1, …, t-1) and P(p(s), s=1, …, t-1) are increasing and time invariant (i.e., they depend only on the amount of effort already devoted, but not on time t).
4. The value/output we create in each period t is proportional to the time p(t) we spend on production multiplied by our overall capability at time t.   Our overall capability is a weighted sum P(t) + mL(t) of fundamental and production capabilities, with m>1.

Goal: choose nonnegative (p(t), l(t), t=1, …, T) so as to maximize the total value ${\sum_{t=1}^T\ p(t) (P(t) + m L(t))}$ subject to ${p(t) + l(t) \leq 1}$ for all t=1, …, T.

The assumption m>1 means that the fundamentals (quality) are more important than mere quantity of production.  The constraint ${p(t) + l(t) \leq 1}$ says that in each period t, we only have a finite amount of energy (assume a total of 1 unit) that can be devoted to produce and learn.  On the one hand, we want to choose a large p(t) because it not only produces value, but also increases future production capabilities P(s), s=t+1, …, T.  On the other hand, since m>1, choosing a large l(t) increases our overall capability more rapidly, enhancing value.  What is the optimal tradeoff?

We pause to comment on our assumptions, some of which can be addressed without complicating our model too much.

Caveats. At the outset, our model assumes every activity can be cleanly classified as building either the fundamental capability or the production capability. In reality, many activities contribute to both. Moreover, the interaction between these two activities is completely ignored, except that they sum to no more than 1 unit. For example, production (games, performance, research and publication, etc.) often provides important incentives and contexts for learning and strongly influences the effectiveness of learning, but our function L is independent of p(s). The time invariance assumption in 3 above implies that we retain our capabilities forever after they are built; in reality, we may lose some of them if we don’t continue to practice. If we think of P(t)+mL(t) as a measure of quality, then our objective function assumes that there is always positive value in production, regardless of its quality. In reality, production of poor quality may have negative value, and can even be fatal.

A puzzle

A simple puzzle is the special case where the capabilities depend on (are) the total amounts of effort devoted, i.e., ${L(t)\ := \ \sum_{s=1}^{t-1} l(s), \ \ \ P(t) \ :=\ \sum_{s=1}^{t-1} p(s) }$
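To experiment with this special case, here is a minimal Python sketch that evaluates the total value of a given schedule. The function name `total_value` and the input checks are my own illustrative choices, and the sketch takes P(1) = L(1) = 0, i.e., no capability before the first period.

```python
def total_value(p, l, m):
    """Total value sum_t p(t) * (P(t) + m * L(t)) when P(t) and L(t) are
    the cumulative production and learning efforts before period t.

    Assumption of this sketch: P(1) = L(1) = 0.
    """
    if len(p) != len(l):
        raise ValueError("p and l must have the same length")
    if m <= 1:
        raise ValueError("the model assumes m > 1")
    for pt, lt in zip(p, l):
        if pt < 0 or lt < 0 or pt + lt > 1 + 1e-12:
            raise ValueError("need p(t), l(t) >= 0 and p(t) + l(t) <= 1")

    value = 0.0
    P = 0.0  # production capability built before the current period
    L = 0.0  # fundamental capability built before the current period
    for pt, lt in zip(p, l):
        value += pt * (P + m * L)  # output of this period
        P += pt                    # effort counts only from the next period on
        L += lt
    return value


# Three hand-picked schedules with T = 4 and m = 2.
print(total_value([1, 1, 1, 1], [0, 0, 0, 0], m=2))                  # → 6.0
print(total_value([0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5], m=2))  # → 4.5
print(total_value([0, 0, 1, 1], [1, 1, 0, 0], m=2))                  # → 9.0
```

These schedules are only examples for comparison; none of them is claimed to be optimal.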

Despite its nonconvexity, the problem can be explicitly solved and the optimal strategy turns out to have a very simple structure.  I will explain the solution in the next post and discuss whether it agrees, to first order, with our intuition and how some of the disagreements can be traced back to our simplifying assumptions.
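A quick sketch of where the nonconvexity comes from, under two assumptions: P(1) = L(1) = 0, and the whole budget is used in every period, l(t) = 1 - p(t) (plausible because the objective never decreases when some l(t) is increased). Substituting into the special case gives

${\sum_{t=1}^T p(t) \sum_{s=1}^{t-1} \left( m - (m-1)\, p(s) \right) \ = \ m \sum_{t=1}^T (t-1)\, p(t) \ - \ (m-1) \sum_{1 \leq s < t \leq T} p(s)\, p(t)}$

The quadratic part contains only cross terms p(s)p(t) with s ≠ t, so its Hessian is -(m-1) times the matrix with zeros on the diagonal and ones elsewhere. That matrix has eigenvalues T-1 (once) and -1 (T-1 times), so for T ≥ 2 and m > 1 the Hessian has both negative and positive eigenvalues, and the objective is neither concave nor convex.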

7 thoughts on “Another puzzle: produce or learn?”

1. Shiva Navabi

Here are some studies on different factors that can be effective in mastering a musical instrument:

http://pom.sagepub.com/content/early/2014/05/27/0305735614534910.abstract

Apparently the quality of the time spent on practicing does matter and a concept called “deliberate practice” has emerged to formalize this idea. Therefore, in the present model we may have to use something like “effective” amounts of production / learning that are less than p(t) / l(t):

p_e(t) = k_p p(t) and l_e(t) = k_l l(t) where k_p, k_l ≤ 1, so that p_e(t) -> p(t) and l_e(t) -> l(t) as k_p, k_l -> 1

• Steven Low

Thanks Shiva, for the interesting links! You are right that p(t) and l(t) should be interpreted as effective amounts.

2. Wei-Chun Lee

I’m not sure about the definition for L(1)
should it be a constant or just zero?
I set it zero for simplicity.

Since l(t)+p(t) <= 1
I simply set p(i) = x_i and l(i) = 1-x_i

Then the objective function becomes something like

x2[ x1+ m(1-x1) ] + x3[ x1 +x2 + m(1-x1) + m(1-x2)] ….
Rearranging this gives
(1-m)x1( x2+x3+…+xT) + mx2 – (1-m)x2( x3+x_4+….) + …

Differentiating this with respect to each variable and doing some calculation,

I get x_i = (T-i) m/(2m-1)
I guess that p(t) = (T-t) *m/(2m-1) and l(t) = 1-p(t)

There is also an interesting special case.
Suppose kids produce at a constant level, which means p(1)=p(2)=…=p(T)
then the best choice for p(t) should be m/(2m-1)
and the maximum value for objective function will be m^2/(4(m-1))

• Steven Low

Hi Wei-Chun, yes, I should say explicitly that P(1) = L(1) = 0.

You are also exactly right that you can assume l(t)+p(t)=1 at optimality since the objective is strictly increasing in p(t) and l(t). You can hence eliminate l(t) from the problem, so the objective function depends only on p(t), t=1, …, T. I get the same objective function as you do, but a different optimal strategy from yours….

• Wei-Chun Lee

I examined the process and saw some errors.

Let S = x1+…+xT
I now have S-xi =m(i-1)/(m-1) for the first-order condition.
Observe that S = m/(m-1) * T/2
so the strategy should be
xi = m(T-2i+2)/(2(m-1))

And I also found the typo in my comment above.
In the special case that :
p(1)=p(2)=….=p(T)=p
The best strategy should be
p = m/(2(m-1)) instead of m/(2m-1)

• Steven Low

hmm… my solution is still different, structurally …, would be happy to discuss off-line if you’d like (slow@caltech.edu)