@frankzhu

Options Pricing with Black-Scholes and Gradient Ascent from scratch

January 20, 2024 / 7 min read

Last Updated: January 20, 2024

As Professor Richard Berner once said, echoing the statistician George Box, "all models are wrong, but some are useful." In financial economics, we have a plethora of models to choose from, each with its own assumptions and limitations.

Last time, we did credit forecasting with the Nelson-Siegel model. This time, we delve into the world of options pricing, exploring the famous Black-Scholes option pricing model to estimate the price of a call option on Apple stock.

Along the way, we will cover the application of the following topics:

  • Black-Scholes options pricing
  • Maximum likelihood estimation via gradient ascent
  • Closed-form estimation of volatility and the Fisher information matrix

Apple stock was selling at $177.70 per share at the close of 10/31/2023. We will use this data to calculate the price of a call option with a strike price of $175, assuming an expiration date of 11/30/2023.

1. Retrieve Historical Data

library('quantmod')
library('maxLik')

# Pull daily AAPL prices from Yahoo Finance for the estimation window
ticker <- 'AAPL'
start_date <- as.Date("2023-03-01")
end_date <- as.Date("2023-10-31")
getSymbols(ticker, src = "yahoo", from = start_date, to = end_date)
apple_data <- as.data.frame(AAPL)
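The later steps work with a vector of daily log returns and a time increment between observations. A minimal sketch, assuming we use the adjusted closing prices and measure time in years so that one trading day gives $\Delta = 1/252$:

# Daily log returns from adjusted closing prices
# (assumption: time measured in years, so one trading day is delta = 1/252)
prices <- apple_data$AAPL.Adjusted
log_returns <- diff(log(prices))
delta <- 1/252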

2. Parametric Estimation

2.1 Log-Likelihood Function

Let $P_t$ follow a geometric Brownian motion, $dP_t = \mu P_t\,dt + \sigma P_t\,dW_t$. From Ito's lemma, the continuously compounded returns $r_i = \ln(P_{t_i}/P_{t_{i-1}})$ over increments of length $\Delta$ are normally distributed with mean $\alpha\Delta$ and variance $\sigma^2\Delta$, where $\alpha = \mu - \sigma^2/2$:

$$r_i \sim N(\alpha\Delta,\ \sigma^2\Delta)$$

Then for any $r_i$ we have the density function:

$$f(r_i;\ \alpha, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}} \exp\!\left(-\frac{(r_i - \alpha\Delta)^2}{2\sigma^2\Delta}\right)$$

For $n$ observations, the joint likelihood function for the continuously compounded returns is:

$$L(\alpha, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2\Delta}} \exp\!\left(-\frac{(r_i - \alpha\Delta)^2}{2\sigma^2\Delta}\right)$$

Taking the natural logarithm of the likelihood function, we convert the product into a sum:

$$\ell(\alpha, \sigma) = -\frac{n}{2}\ln(2\pi\sigma^2\Delta) - \frac{1}{2\sigma^2\Delta}\sum_{i=1}^{n}(r_i - \alpha\Delta)^2$$

In R, we program the log-likelihood function as follows:

# log-likelihood function (we maximize this directly)
log_likelihood <- function(params, log_returns, delta) {
  alpha <- params[1]
  sigma <- params[2]
  n <- length(log_returns)

  if (sigma <= 0) return(-Inf) # Ensure sigma is positive

  # Calculate the log-likelihood
  ll <- -n/2 * log(2 * pi * sigma^2 * delta) -
    1/(2 * sigma^2 * delta) * sum((log_returns - alpha * delta)^2)
  return(ll)
}
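One way to sanity-check the implementation is to compare it against a sum of normal log-densities with mean $\alpha\Delta$ and standard deviation $\sigma\sqrt{\Delta}$, which should give the same value; a minimal sketch using base R's dnorm (the helper name check_ll is ours):

# Equivalent log-likelihood via dnorm: the returns are N(alpha*delta, sigma^2*delta),
# so summing the log-densities should reproduce log_likelihood() up to rounding
check_ll <- function(params, log_returns, delta) {
  sum(dnorm(log_returns, mean = params[1] * delta,
            sd = params[2] * sqrt(delta), log = TRUE))
}
# e.g., check_ll(c(0.1, 0.2), log_returns, delta) should match
# log_likelihood(c(0.1, 0.2), log_returns, delta)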

2.2 Gradient Ascent and Maximum Likelihood

We apply the gradient ascent algorithm to maximize the log-likelihood function by iteratively moving towards the maximum of the function. The gradients are found by differentiating $\ell$ with respect to each parameter:

$$\frac{\partial \ell}{\partial \alpha} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(r_i - \alpha\Delta), \qquad \frac{\partial \ell}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\Delta\sigma^3}\sum_{i=1}^{n}(r_i - \alpha\Delta)^2$$

gradient <- function(params, log_returns, delta) {
  alpha <- params[1]
  sigma <- params[2]
  n <- length(log_returns)

  # Partial derivatives of the log-likelihood with respect to alpha and sigma
  dL_dalpha <- 1/(sigma^2) * sum(log_returns - alpha * delta)
  dL_dsigma <- -n/sigma + 1/(delta * sigma^3) * sum((log_returns - alpha * delta)^2)
  return(c(dL_dalpha, dL_dsigma))
}
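Since maxLik is already loaded, we can also verify the analytic gradient numerically; a sketch assuming maxLik's numericGradient helper and illustrative parameter values:

# Finite-difference check of the analytic gradient (illustrative values for alpha, sigma)
p0 <- c(alpha = 0.1, sigma = 0.2)
num_grad <- numericGradient(function(p) log_likelihood(p, log_returns, delta), t0 = p0)
ana_grad <- gradient(p0, log_returns, delta)
print(max(abs(num_grad - ana_grad))) # should be near zero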

We then run the optimization from initial parameter guesses, taking small steps (scaled by the learning rate) until we reach the iteration cap or the gradient falls below the tolerance:

Gradient Ascent

# Gradient ascent for log-likelihood maximization
gradient_ascent <- function(log_returns, delta, max_iter = 1000000,
                            learning_rate = 1e-6, tolerance = 0.0001) {
  # Initial guess
  params <- c(alpha = mean(log_returns), sigma = sd(log_returns))
  n <- length(log_returns)
  cost_history <- numeric(max_iter) # To track the log-likelihood for debugging

  for (i in 1:max_iter) {
    grad <- gradient(params, log_returns, delta)

    # Update parameter proposal (gradient ascent, so we ADD the gradient)
    new_params <- params + learning_rate * grad

    # Ensure sigma is not non-positive after the update
    if (new_params[2] <= 0) {
      new_params[2] <- max(params[2], 1e-8) # Revert sigma or set a small positive value
    }

    params <- new_params

    # Track the log-likelihood (for monitoring purposes, since we maximize it)
    cost_history[i] <- log_likelihood(params, log_returns, delta)

    # Check for convergence
    if (all(abs(grad) < tolerance)) {
      cat("Convergence achieved after", i, "iterations.\n")
      break
    }
  }

  return(params) # Return the estimated parameters
}

# Run the gradient ascent
estimated_params <- gradient_ascent(log_returns, delta)
# Print the results
print(estimated_params)
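The maxLik package loaded at the top also provides a general-purpose maximizer that can serve as an independent check on our hand-rolled optimizer; a sketch, reusing the same log-likelihood, gradient, and starting values:

# Cross-check: BFGS maximization of the same log-likelihood via maxLik
ml_fit <- maxLik(logLik = function(p) log_likelihood(p, log_returns, delta),
                 grad = function(p) gradient(p, log_returns, delta),
                 start = c(alpha = mean(log_returns), sigma = sd(log_returns)),
                 method = "BFGS")
summary(ml_fit) # estimates should agree with gradient_ascent()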

2.3 Closed-Form Results

We check for the robustness of the gradient ascent by comparing the results against the closed-form maximum likelihood estimates:

$$\hat{\alpha} = \frac{1}{n\Delta}\sum_{i=1}^{n} r_i, \qquad \hat{\sigma} = \sqrt{\frac{1}{n\Delta}\sum_{i=1}^{n}(r_i - \hat{\alpha}\Delta)^2}$$

And the results match:

# Calculate alpha_hat and sigma_hat using closed-form solutions for verification
alphahat <- 1/(length(log_returns) * delta) * sum(log_returns)
sigmahat <- sqrt(1/(length(log_returns) * delta) * sum((log_returns - alphahat * delta)^2))

cat("Closed form alpha_hat:", alphahat, "\n")
cat("Closed form sigma_hat:", sigmahat, "\n")

cat("Predicted alpha:", estimated_params[1], "\n")
cat("Predicted sigma:", estimated_params[2], "\n")

We find that the closed-form $\hat{\alpha}$ and $\hat{\sigma}$ match the values predicted by gradient ascent exactly!

2.4 Fisher Information Matrix and Variance of the Estimator

To obtain the asymptotic variance of $\hat{\sigma}$, per Greene, we invert the negative of the Hessian matrix $H$, or observed Fisher information matrix, and take the diagonal element of the resulting variance-covariance matrix associated with $\sigma$:

$$\widehat{\mathrm{Var}}(\hat{\theta}) = \left[-H(\hat{\theta})\right]^{-1}, \qquad \widehat{\mathrm{Var}}(\hat{\sigma}) = \left[-H(\hat{\theta})\right]^{-1}_{22}$$

Hessian matrix and positive log-likelihood

# Define the Hessian matrix for the positive log-likelihood
hessian_matrix <- function(params, log_returns, delta) {
  alpha <- params[1]
  sigma <- params[2]
  n <- length(log_returns)

  # The elements of the Hessian matrix
  d2L_dalpha2 <- -n * delta / (sigma^2)
  d2L_dalphadsigma <- -2 * (sigma^-3) * sum(log_returns - alpha * delta)
  d2L_dsigma2 <- -3/delta * sum((log_returns - alpha * delta)^2) * sigma^(-4) + n * sigma^(-2)

  # Construct the Hessian matrix
  hessian <- matrix(c(d2L_dalpha2, d2L_dalphadsigma,
                      d2L_dalphadsigma, d2L_dsigma2), nrow = 2)
  return(hessian)
}

# Calculate the Hessian at the estimated parameters
obs_hessian <- hessian_matrix(estimated_params, log_returns, delta)

# Invert the Hessian to get the variance-covariance matrix
var_cov_matrix <- solve(-obs_hessian) # solve() is the base R method for matrix inversion

# Extract the variance of sigma from the matrix
var_sigma <- var_cov_matrix[2, 2] # The variance of sigma is in the second row, second column

# Print the estimated variance of sigma
cat("Variance of sigma:", var_sigma, "\n")
"Variance of sigma: 5.633948e-06"

3. Call Option Price

Finally, from Black-Scholes, we can price the European-style AAPL call option as:

$$C_t = P_t N(h_1) - K e^{-r(T-t)} N(h_2)$$

where

$$h_1 = \frac{\ln(P_t/K) + (r + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}, \qquad h_2 = h_1 - \sigma\sqrt{T-t}$$

Black-Scholes Call Option Pricing

# Given values
P_t <- 177.70 # Current price of the stock
K <- 175 # Strike price
r <- 0.05 # Risk-free interest rate (annual)
sigma <- estimated_params[2] # Volatility from the estimated parameters

# Time to expiration (21 trading days out of 252 in a year)
T_t <- 21/252

# Black-Scholes formula components
h1 <- (log(P_t/K) + (r + sigma^2/2) * T_t) / (sigma * sqrt(T_t))
h2 <- h1 - sigma * sqrt(T_t)

# Using pnorm for the cumulative distribution function of the standard normal distribution
N_d1 <- pnorm(h1)
N_d2 <- pnorm(h2)

# Black-Scholes call option price formula
call_option_price <- P_t * N_d1 - K * exp(-r * T_t) * N_d2

# Output the result (a standard contract covers 100 shares)
cat("Option premium per share:", call_option_price, "\n")
cat("Option price:", call_option_price * 100, "\n")
"Option premium per share: 3.485288"
"Option price: 348.5288"

Have a wonderful day.

– Frank
