Reinforcement learning for plants

January 25, 2025

This post describes our attempt to use reinforcement learning to grow plants by learning a lighting schedule from data.

Markov Decision Process (MDP)

We first frame the problem as a Markov Decision Process (MDP).

Reward function

Difference in log area

$R_{t+1} = \ln(\text{Area}_{t+1})-\ln(\text{Area}_{t})$

The rewards depend only on the relative growth of the plant, not on its absolute size. This is desirable because, when training the neural network, the error should not be dominated by the largest plants. The return is the log-ratio of the final and initial area, so it is invariant to the initial area, and its exponential is the ratio of the final and initial area. We can interpret the return as the number of $e$-foldings of the plant area; dividing it by $\ln 2$ gives the number of doublings. In practice we use $\ln(x+1)$ to avoid issues with zero area, so these properties hold approximately, with negligible error once areas are much larger than 1.

$G = \ln(\text{Area}_{T})-\ln(\text{Area}_{0}) = \sum_{t=0}^{T-1} R_{t+1}$
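As a minimal sketch of the reward and return above, the snippet below computes the per-step rewards with the $\ln(x+1)$ variant and checks that they telescope to the log-ratio of final and initial area. The function name `log_area_rewards` and the area values are made up for illustration; they are not taken from our data or code.

```python
import math

def log_area_rewards(areas):
    """Per-step rewards R_{t+1} = ln(A_{t+1} + 1) - ln(A_t + 1)."""
    logs = [math.log1p(a) for a in areas]  # log1p(a) = ln(a + 1), safe at a = 0
    return [nxt - cur for cur, nxt in zip(logs, logs[1:])]

# Hypothetical area sequence in mm^2 (illustrative values only).
areas = [100.0, 120.0, 150.0, 200.0]

rewards = log_area_rewards(areas)
G = sum(rewards)  # undiscounted return

# The sum telescopes: only the first and last areas matter.
assert math.isclose(G, math.log1p(areas[-1]) - math.log1p(areas[0]))

# Convert the return from natural-log units to a number of doublings.
doublings = G / math.log(2)
print(f"return = {G:.3f}, doublings = {doublings:.3f}")
```

Because the return is in natural-log units, `math.exp(G)` recovers the ratio of final to initial (offset) area.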

Alternative Reward functions

We also considered the following reward functions.

Difference in area

$\text{Area}_{t+1} - \text{Area}_{t}$

The return is not invariant to the initial area. The rewards scale with the size of the plant, so if the plant grows roughly exponentially, the rewards grow exponentially over time as well.
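A small sketch of this scale dependence, using made-up numbers: two hypothetical plants follow the same relative growth curve, but one is ten times larger. The helper names are ours, for illustration only.

```python
import math

def diff_rewards(areas):
    """Per-step rewards A_{t+1} - A_t."""
    return [nxt - cur for cur, nxt in zip(areas, areas[1:])]

def log_rewards(areas):
    """Per-step rewards ln(A_{t+1}) - ln(A_t), without the +1 offset."""
    return [math.log(nxt) - math.log(cur) for cur, nxt in zip(areas, areas[1:])]

# Hypothetical areas in mm^2: same relative growth, different scale.
small = [10.0, 12.0, 15.0]
large = [10 * a for a in small]

G_diff_small = sum(diff_rewards(small))
G_diff_large = sum(diff_rewards(large))
G_log_small = sum(log_rewards(small))
G_log_large = sum(log_rewards(large))

# The difference-in-area return scales with plant size...
print(G_diff_small, G_diff_large)
# ...while the log-area return is the same for both plants.
print(math.isclose(G_log_small, G_log_large))
```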

Percentage change in area

$\frac{\text{Area}_{t+1}}{\text{Area}_{t}} - 1$

While this reward function induces a return that is invariant to the initial area, it can favour volatility in area sequences.

For example, a plant that grows from $100 \text{ mm}^2$ to $110 \text{ mm}^2$ and then shrinks back to $100 \text{ mm}^2$ gets a return of $G = \frac{110 - 100}{100} + \frac{100 - 110}{110} \approx 0.1 + (-0.091) = 0.009$. The plant did not grow overall, so the return should be zero.
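The same example can be checked numerically. The sketch below compares the percentage-change return with the log-area return (without the $+1$ offset, for simplicity) on the grow-then-shrink sequence.

```python
import math

# The grow-then-shrink sequence from the example, in mm^2.
areas = [100.0, 110.0, 100.0]
steps = list(zip(areas, areas[1:]))

# Percentage-change rewards: A_{t+1} / A_t - 1
pct_return = sum(nxt / cur - 1 for cur, nxt in steps)

# Log-area rewards: ln(A_{t+1}) - ln(A_t)
log_return = sum(math.log(nxt) - math.log(cur) for cur, nxt in steps)

# The percentage-change return is slightly positive despite zero net growth;
# the log-area return cancels out.
print(f"percentage-change return = {pct_return:.4f}")
print(f"log-area return          = {log_return:.4f}")
```

A longer oscillating sequence would accumulate this small positive bias at every cycle, which is the sense in which the percentage-change reward can favour volatility.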

Difference in area divided by initial area

$\frac{\text{Area}_{t+1} - \text{Area}_{t}}{\text{Area}_{0}}$

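While this reward function induces a return that is invariant to the initial area, and the return equals the total relative growth $\frac{\text{Area}_{T}}{\text{Area}_{0}} - 1$, its rewards still scale with the current size of the plant, so they grow exponentially over time when the plant grows exponentially.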
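To illustrate, the sketch below uses a hypothetical plant that doubles in area at every step. The rewards normalised by the initial area double each step, while the log-area rewards (shown without the $+1$ offset) stay constant. The values and helper name are assumptions for illustration.

```python
import math

def normalised_rewards(areas):
    """Per-step rewards (A_{t+1} - A_t) / A_0."""
    a0 = areas[0]
    return [(nxt - cur) / a0 for cur, nxt in zip(areas, areas[1:])]

# Hypothetical plant that doubles in area every step (mm^2).
areas = [100.0, 200.0, 400.0, 800.0]

rewards = normalised_rewards(areas)
G = sum(rewards)

# The return equals the total relative growth A_T / A_0 - 1.
assert math.isclose(G, areas[-1] / areas[0] - 1)

# For comparison, the log-area rewards are the same at every step (ln 2).
log_rewards = [math.log(nxt) - math.log(cur) for cur, nxt in zip(areas, areas[1:])]

print(rewards)      # grows with plant size
print(log_rewards)  # constant
```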

Methods

Offline RL

Deployment

How do we do inference? Using the mean plant embedding seems to be a reasonable choice (perhaps the mean over only those plants that are still “alive”).
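One way this pooling could look is sketched below. It is only an illustration of the idea, not our deployed code: the function `mean_embedding`, the `alive` flags, and the toy two-dimensional embeddings are all hypothetical.

```python
def mean_embedding(embeddings, alive=None):
    """Average per-plant embeddings, optionally over live plants only.

    embeddings: list of equal-length lists of floats, one per plant.
    alive: optional list of booleans, one per plant (hypothetical flag).
    """
    if alive is None:
        alive = [True] * len(embeddings)
    kept = [e for e, is_alive in zip(embeddings, alive) if is_alive]
    if not kept:
        raise ValueError("no live plants to average over")
    dim = len(kept[0])
    return [sum(e[i] for e in kept) / len(kept) for i in range(dim)]

# Toy example: the third plant is dead and has an outlier embedding.
embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
alive = [True, True, False]

pooled_all = mean_embedding(embeddings)           # includes the dead plant
pooled_alive = mean_embedding(embeddings, alive)  # live plants only

print(pooled_all)
print(pooled_alive)
```

Restricting the mean to live plants keeps a dead plant's embedding from pulling the pooled state away from the plants the policy can still affect.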