Using a neural network to solve the Poisson equation

  • #1
docnet
TL;DR Summary
I coded a neural network model to predict solutions to ##u''(x)=100\sin(5x)## over ##[-1,1]## using TensorFlow. The boundary conditions at ##x=-1## and ##x=1## uniquely determine the solution. I'm using a very basic model architecture with the ReLU activation function.
To train the model, I generated a set of deterministic solutions with random boundary conditions ##u(-1)=a## and ##u(1)=b##. I then added a small amount of noise to these solutions. However, the model's accuracy is significantly worse compared to the most basic finite difference methods. Is there anything that I did wrong in the code below? Thanks in advance for your input regarding the model, suggestions to make the project more interesting, etc. :bow: :smile:


Can someone with more experience provide guidance on improving this project? Additionally, does anyone have ideas to make this project more interesting?
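
For reference, here is a short check of the closed form that the data generator below hard-codes. Integrating ##u''(x)=100\sin(5x)## twice gives
$$u(x) = -4\sin(5x) + c_1 x + c_2.$$
Imposing ##u(-1)=a## and ##u(1)=b##:
$$4\sin(5) - c_1 + c_2 = a, \qquad -4\sin(5) + c_1 + c_2 = b.$$
Adding and subtracting the two conditions yields
$$c_2 = \frac{a+b}{2}, \qquad c_1 = 4\sin(5) + \frac{b-a}{2},$$
which matches the c_1 and c_2 used in the code.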

Python:
import numpy as np

# data generation function
def generate_data(num_samples):
    '''Generate random boundary conditions (a, b) and the exact solutions of
    u''(x) = 100 sin(5x) with u(-1) = a, u(1) = b on a 100-point grid.'''

    boundary_conditions = np.random.uniform(-1, 1, size=(num_samples, 2))

    x = np.linspace(-1, 1, 100)

    # loop for the output
    solution_points = []
    for a, b in boundary_conditions:
        # determine coefficients from u(-1) = a and u(1) = b
        c_1 = 4 * np.sin(5) - (a / 2) + (b / 2)
        c_2 = a / 2 + b / 2
        # exact solution u(x) = -4 sin(5x) + c_1 x + c_2
        u = lambda x: -4 * np.sin(5 * x) + c_1 * x + c_2

        solution_points.append(u(x))

    return boundary_conditions, np.array(solution_points)
Code:
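from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Add noise function
def add_noise(data, noise_level=0.1):
    # zero-mean Gaussian noise; noise_level is the standard deviation
    # (a mean of -noise_level would shift every sample by a constant bias)
    noise = np.random.normal(0.0, noise_level, data.shape)
    return data + noise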

# Generate base data
X, y = generate_data(100000)

# Split data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)


# Add noise to the training data
noise_level = 0.1
X_train_noisy = add_noise(X_train, noise_level)
y_train_noisy = add_noise(y_train, noise_level)



# Shapes follow from 100000 samples with a 20% validation split
print("X_train_noisy shape:", X_train_noisy.shape)  # (80000, 2)
print("y_train_noisy shape:", y_train_noisy.shape)  # (80000, 100)
print("X_val shape:", X_val.shape)    # (20000, 2)
print("y_val shape:", y_val.shape)    # (20000, 100)

'''
print("X_train_shape:", X_train.shape)  # (80000, 2)
print("y_train shape:", y_train.shape)  # (80000, 100)
print("X_val shape:", X_val.shape)    # (20000, 2)
print("y_val shape:", y_val.shape)    # (20000, 100)'''


# Define the model
model = Sequential([
    Dense(64, activation='relu', input_shape=(2,)),  # First hidden layer (64 neurons); takes the 2 boundary values
    Dense(128, activation='relu'),                   # Hidden layer with 128 neurons
    Dense(256, activation='relu'),                   # Hidden layer with 256 neurons
    Dense(128, activation='relu'),                   # Hidden layer with 128 neurons
    Dense(100)                                       # Linear output layer: u at the 100 grid points
])

# Compile the model
model.compile(optimizer='adam', loss='mse')


# Set up early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)


# Train the model with noisy training data and clean validation data
history = model.fit(X_train_noisy, y_train_noisy, epochs=50, batch_size=32, validation_data=(X_val, y_val), callbacks = [early_stopping])

'''
# Train the model with clean training data and clean validation data
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_val, y_val))'''


# Evaluate the model on the validation set
val_loss = model.evaluate(X_val, y_val)
print(f"Validation loss: {val_loss}")
Screenshot 2024-08-03 at 10.04.00 PM.png

Code:
import matplotlib.pyplot as plt

# Example prediction
a = .5
b = -.5
boundary_conditions_test = np.array([[a, b]])  # Example test boundary conditions
predicted_solution = model.predict(boundary_conditions_test)


#test the prediction

# determine coefficients
c_1 = 4* np.sin(5) - (a/2) + (b/2)
c_2 =  a/2 + b/2
u = lambda x: -4 * np.sin(5* x) + c_1*x + c_2
x = np.linspace(-1,1,100)

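# np.infty was removed in NumPy 2.0; np.inf works in all versions
error2 = np.linalg.norm(u(x) - predicted_solution[0], ord=2)      # discrete 2-norm over the 100 grid points
error = np.linalg.norm(u(x) - predicted_solution[0], ord=np.inf)  # maximum pointwise error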


# plot the prediction
fig = plt.figure(figsize=(12,7))
axes = fig.add_subplot(1, 1, 1)
axes.plot(x, u(x), 'k', label="u_true")
axes.plot(x, predicted_solution[0], 'bo', label="Prediction", markersize=5)

# empty plots used only to show the error values in the legend
plt.plot([], [], ' ', label='global error = {}'.format(error2))
plt.plot([], [], ' ', label='max error = {}'.format(error))

axes.set_xlabel("$x$", fontsize= 20)
axes.set_ylabel("$u(x)$", fontsize= 20)
axes.grid(True)
fig.patch.set_facecolor('xkcd:sky')
axes.set_facecolor("xkcd:very light blue")
axes.set_title(label = "u(x)", fontsize = 20)
plt.legend(fontsize = 13)
plt.show()
Screenshot 2024-08-03 at 10.05.04 PM.png
 
  • #3
Hmm, I want to show how a nn can be used to predict solutions from noisy experimental data when the solution isn't known, which is something that conventional numerical methods can't do. It's more of a write-up to display basic understanding of nns and their possible uses. It's still a toy project, so I want to make it more complex and interesting.

The nn algorithm as it stands is not accurate. The global error gets as low as .03 which is still huge compared to the basic finite difference method. I want to know how to get the model as accurate as it can be, before I introduce any complexity to the algorithm.
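
For a concrete baseline to compare against, here is a minimal sketch of the kind of basic second-order finite difference solve meant above. The function names `solve_poisson_fd` and `exact_solution`, the dense matrix solve, and the 100-point grid are illustrative choices for this sketch, not code from the original experiment.

```python
import numpy as np

def exact_solution(x, a, b):
    """Closed-form solution of u'' = 100 sin(5x), u(-1) = a, u(1) = b."""
    c_1 = 4.0 * np.sin(5.0) + (b - a) / 2.0
    c_2 = (a + b) / 2.0
    return -4.0 * np.sin(5.0 * x) + c_1 * x + c_2

def solve_poisson_fd(a, b, n=100):
    """Second-order central differences on n equally spaced points in [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n)
    h = x[1] - x[0]
    m = n - 2  # number of interior unknowns

    # Tridiagonal matrix for (u[i-1] - 2 u[i] + u[i+1]) / h^2 at interior points
    A = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1)) / h**2

    # Right-hand side, with the known boundary values moved over
    rhs = 100.0 * np.sin(5.0 * x[1:-1])
    rhs[0] -= a / h**2
    rhs[-1] -= b / h**2

    u = np.empty(n)
    u[0], u[-1] = a, b
    u[1:-1] = np.linalg.solve(A, rhs)
    return x, u

# Same test boundary conditions as in the first post
x_fd, u_fd = solve_poisson_fd(0.5, -0.5, n=100)
print("FD max error:", np.max(np.abs(u_fd - exact_solution(x_fd, 0.5, -0.5))))
```

Because the scheme is second order, refining the grid by a factor of 10 should shrink the error by roughly a factor of 100, which gives a simple yardstick for the network's error on the same grid.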

Some things I've tried without success: adding more layers, using different activation functions, adding dropout layers, L2 regularization, and increasing the number of epochs. Increasing the size of the training data has been mildly successful but has diminishing returns.


To make it more interesting, instead of the 1-D Poisson equation I could focus on a DE that doesn't have closed-form solutions. Maybe I could try to make a hybrid method using the nn and a spectral method to solve the inviscid Burgers' equation or something.
 
  • #4
Honestly, I think ML is not the right tool for the job.

If you know (or suspect) the functional form, you can use classical statistics to fit it.

If you don't know the functional form, the AI can only figure it out if it was in the training dataset. But if you anticipated that possibility, you could also fit to that function and do just as well with classical statistics. And a lot less computation.
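
As a rough illustration of the classical fit described above: assuming the functional form ##-4\sin(5x)+c_1x+c_2## from the first post is known, ordinary least squares recovers ##c_1## and ##c_2## (and hence the boundary values) from a single noisy sample. The noise level of 0.1, the fixed seed, and all variable names are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
x = np.linspace(-1.0, 1.0, 100)

# "True" boundary values for this illustration (arbitrary choice)
a, b = 0.5, -0.5
c1_true = 4.0 * np.sin(5.0) + (b - a) / 2.0
c2_true = (a + b) / 2.0

# One noisy measurement of the solution on the grid
u_noisy = (-4.0 * np.sin(5.0 * x) + c1_true * x + c2_true
           + rng.normal(0.0, 0.1, x.shape))

# Move the known particular part to the left-hand side; what remains,
# c1 * x + c2, is linear in the unknown coefficients.
design = np.column_stack([x, np.ones_like(x)])
target = u_noisy + 4.0 * np.sin(5.0 * x)
(c1_fit, c2_fit), *_ = np.linalg.lstsq(design, target, rcond=None)

# Boundary values implied by the fitted coefficients
a_fit = 4.0 * np.sin(5.0) - c1_fit + c2_fit   # u(-1)
b_fit = -4.0 * np.sin(5.0) + c1_fit + c2_fit  # u(1)
print("fitted c1, c2:", c1_fit, c2_fit)
print("implied a, b:", a_fit, b_fit)
```

The fit uses two parameters and one linear solve, which is the "lot less computation" side of the comparison.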

Finally, if you want ML to tell you which of two functions best fit the data when classical statistics tells you they do about equally well, it won't do that for you. If both fit adequately, then both fit adequately.

There are things AI/ML is good at. This is not one of them. Like FizzBuzz.
 
  • #5
There is another approach called Eureqa that uses genetic algorithms to discover the underlying curve in a collection of related data.

https://en.wikipedia.org/wiki/Eureqa

Basically the issue you're facing is what tool is best used for the given problem you are trying to solve. It's not a good idea to use a hammer for every project.
 
  • #6
That's a valid point. Neural networks are more often used for classification tasks because of their probabilistic nature, but there is valid STEM research that uses and develops nns for solving PDEs. This includes a good friend of mine who solves cancer growth-related PDEs using machine learning for her postdoc. Since I'm not solving PDEs related to cancer, developing the next cutting-edge classification algorithm, or doing physics at CERN, :smile: the model's task is simple.

I switched to Sigmoid activation function, deleted 2 hidden layers and lowered the number of nodes in the one remaining middle layer. I'm happy that the result is consistently around twice as accurate as before, despite being simpler.

I'll experiment some more, document my findings in detail, and move on.

Next, I think I'll experiment with creating nns for stochastic forecasting and solving stochastic DEs.
 
  • #7
If you are trying to solve PDEs with AI/ML, I am pretty sure how it's going to turn out. You are effectively going to implement '60s and '70s style analog computers with multilayer perceptrons.

Where AI/ML does best is with a relatively large number of correlated variables, where it can extract the important (and by extension unimportant) correlations. Lots of people instead adopt the strategy of dumping everything into TensorFlow and hoping. That tends not to work any better than what we already had, at great computational cost.
 
  • #8
Just to be clear, this thread is about Neural Networks. There are other forms of AI and ML.
 
  • #9
I've used MNIST data to train a convolutional neural network for classification tasks, as part of a class project. It seems that neural networks are much better suited for those tasks than for solving equations, which requires much higher accuracy.

The Chebyshev spectral method I learned in Kyle Mandli's class can solve the same 1-D Poisson equation to around 1e-16, which is close to machine precision, in virtually no computation time.
Screenshot 2024-08-04 at 6.39.23 PM.png
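
For readers who want to try the spectral comparison, here is a minimal sketch of Chebyshev collocation for the same problem. It uses the standard Chebyshev differentiation matrix construction and is not necessarily the code from the class mentioned above; the names `cheb` and `solve_poisson_cheb` and the choice N = 32 are assumptions for this sketch.

```python
import numpy as np

def exact_solution(x, a, b):
    """Closed-form solution of u'' = 100 sin(5x), u(-1) = a, u(1) = b."""
    c_1 = 4.0 * np.sin(5.0) + (b - a) / 2.0
    c_2 = (a + b) / 2.0
    return -4.0 * np.sin(5.0 * x) + c_1 * x + c_2

def cheb(N):
    """Chebyshev differentiation matrix D and nodes x (x[0] = 1, x[N] = -1)."""
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.ones(N + 1)
    c[0] = c[-1] = 2.0
    c *= (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))  # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                      # diagonal: negative row sums
    return D, x

def solve_poisson_cheb(a, b, N=32):
    """Collocation solve of u'' = 100 sin(5x) with u(-1) = a, u(1) = b."""
    D, x = cheb(N)
    D2 = D @ D
    f = 100.0 * np.sin(5.0 * x)

    # Nodes run from x[0] = 1 down to x[N] = -1, so u[0] = b and u[N] = a.
    # Move the known boundary columns to the right-hand side.
    rhs = f[1:N] - D2[1:N, 0] * b - D2[1:N, N] * a

    u = np.empty(N + 1)
    u[0], u[N] = b, a
    u[1:N] = np.linalg.solve(D2[1:N, 1:N], rhs)
    return x, u

x_ch, u_ch = solve_poisson_cheb(0.5, -0.5)
print("Chebyshev max error:", np.max(np.abs(u_ch - exact_solution(x_ch, 0.5, -0.5))))
```

Since ##\sin(5x)## is smooth, the error falls off faster than any power of N until rounding error takes over, which is why so few points are needed.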






This morning I found out it's possible to solve an ODE to 5e-4 using a neural network with just 11 nodes, using a self-supervised technique. I'm learning how to do something similar right now.
 
