[Python] Optimization: determining gradient with variable window size

In summary, the thread discusses a Python program for analyzing data from a compression experiment. One section, which estimates the gradient of a measured quantity by growing a window around each data point, is very slow. A reply suggests smoothing the data with a Gaussian blur and computing the gradient via Fourier transforms. The original poster has already tried smoothing followed by central differences and found the result either too noisy or, with heavier smoothing, prone to cutting corners off the data. They remain open to other optimization tips.
  • #1
Avarus
Hi all, I'm not quite sure if this is the right place to post my question, so forgive me if it's not...

I've written a program in Python that analyses data that I got from a compression experiment (mechanical testing of rocks and such), and I've written a piece of code that estimates the gradient of a given quantity. It seems to work fine, but it is very slow. Whereas the rest of the program does its job in about 10 seconds, this little piece of code takes about 4 minutes to execute.

Code:
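# written in Python; this loop runs for each data point n
# (absolute is presumably numpy.absolute)
while not done:
   i += 10                     # half-width grows by 10, so the window grows by 20
   start = n-i
   end = n+i
   if n-i < 0:                 # window would start before the first data point
      start = 0
      end = n+2*i
   if n+i > len(data)-1:       # window would end after the last data point
      end = len(data)-1
      start = n-i
      if start < 0:
         start = 0
   # std/0.05 equals 20*std: the end points must differ by 20 standard deviations
   if absolute(data[start] - data[end]) >= std/0.05:
      done = True
      windows[n] = end - start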

Note that this excerpt takes up almost 100% of the function's computation time, so I left out the rest. What does this piece do? It starts with a window of size 10, then checks whether the values of the first and last data points in this window differ by more than a given value (so that the difference is significant). If they do, the window size is correct; if they don't, the window grows by 20 and the check is repeated, and so on. The if statements are there to make sure the window does not fall outside the data.

So I know this method works, but I realize it is horribly inefficient, since most windows are about 400 points wide, and some are up to 2000. With about 500,000 data points to process, this loop runs about 10,000,000 times for the entire dataset. My mathematical background for this kind of thing is not too good, but do you guys perhaps know a more elegant method? Or do you have any other optimization tips?
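One possible way to cut the loop count, sketched here under assumptions rather than taken from the original program: grow the half-width for all unfinished points at once with numpy, so the Python-level loop runs once per half-width instead of once per point and half-width. The function name window_sizes and the parameters step and factor are hypothetical. Both window ends are simply clamped to the data range, which differs from the end = n+2*i handling in the excerpt, and points that never reach the threshold fall back to the full record length (also an assumption).

```python
import numpy as np

def window_sizes(data, std, step=10, factor=20.0):
    """For every index n, find the smallest window (in steps of 2*step)
    whose first and last values differ by at least factor * std."""
    data = np.asarray(data, dtype=float)
    last = len(data) - 1
    # Assumed fallback: points that never reach the threshold get the full range.
    windows = np.full(len(data), last, dtype=int)
    pending = np.arange(len(data))      # indices that are still searching
    threshold = factor * std
    i = 0
    # Once i >= last every window already spans the whole record, so stop.
    while pending.size and i < last:
        i += step
        start = np.clip(pending - i, 0, last)   # clamp at the left edge
        end = np.clip(pending + i, 0, last)     # clamp at the right edge
        hit = np.abs(data[end] - data[start]) >= threshold
        windows[pending[hit]] = end[hit] - start[hit]
        pending = pending[~hit]                 # keep only unfinished points
    return windows
```

With windows of up to 2000 points and a step of 10, the outer loop would run at most a few hundred times, with each pass doing array operations over the remaining points.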

Thanks
 
  • #2
Ibix


Your approach is rather sensitive to noise - a single outlier can give you unexpectedly narrow windows at ten-pixel intervals on either side of it. Also, you are handling the ends of the array differently. You either need to change end=n+2*i to just end=n+i in the first if, or start=n-i to start=n-2*i in the second.

If you're not wedded to this idea, I can offer an alternative:
- Smooth your function using a Gaussian blur
- Calculate the gradient of that smoothed function

It turns out that you can do this by populating an array with the first derivative of a Gaussian function, Fourier transforming it, Fourier transforming your data, multiplying the arrays point-by-point, and inverse Fourier transforming the product. If you can install the numpy library, that will do 99.9% of the hard work for you - happy to help.
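The recipe above can be sketched with numpy's FFT routines as follows. This is a minimal illustration under assumptions, not code from the thread: the function name gaussian_derivative_gradient and the parameters sigma (blur width in samples) and dx (sample spacing) are hypothetical. Because multiplying FFTs gives a circular convolution, the data are treated as periodic and the estimate is unreliable within a few sigma of either end.

```python
import numpy as np

def gaussian_derivative_gradient(data, sigma, dx=1.0):
    """Smoothed gradient: convolve the data with the first derivative
    of a Gaussian, using FFTs to do the convolution."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    # Sample offsets in wrapped order (0, 1, 2, ..., -2, -1), so that the
    # kernel is centred on index 0 as circular convolution expects.
    offsets = (np.arange(n) + n // 2) % n - n // 2
    gauss = np.exp(-0.5 * (offsets / sigma) ** 2)
    gauss /= gauss.sum()                        # unit-area smoothing kernel
    kernel = -offsets / sigma ** 2 * gauss      # d/dx of the Gaussian
    # Point-by-point product in Fourier space equals circular convolution.
    product = np.fft.rfft(data) * np.fft.rfft(kernel)
    return np.fft.irfft(product, n=n) / dx
```

For a straight line with slope 2 the interior of the result is close to 2, while the values near the ends are distorted by the wrap-around; padding the data before the transform is one way to reduce that.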
 
  • #3
Avarus


Ibix said:
Your approach is rather sensitive to noise - a single outlier can give you unexpectedly narrow windows at ten-pixel intervals on either side of it. Also, you are handling the ends of the array differently. You either need to change end=n+2*i to just end=n+i in the first if, or start=n-i to start=n-2*i in the second.

If you're not wedded to this idea, I can offer an alternative:
- Smooth your function using a Gaussian blur
- Calculate the gradient of that smoothed function

It turns out that you can do this by populating an array with the first derivative of a Gaussian function, Fourier transforming it, Fourier transforming your data, multiplying the arrays point-by-point, and inverse Fourier transforming the product. If you can install the numpy library, that will do 99.9% of the hard work for you - happy to help.

Thanks for the suggestions. What I did not include in my post is that the slope will be calculated using OLS (ordinary least squares) over the number of data points required to make the slope significant, i.e. the difference between the start and end points being 20x the standard deviation of the noise.
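For illustration, an OLS slope over one window could look like the sketch below. The names ols_slope, x, y, start and end are hypothetical, and treating end as inclusive is an assumption; the actual fitting code was not posted.

```python
import numpy as np

def ols_slope(x, y, start, end):
    """Least-squares slope of y against x over the window [start, end]."""
    xs = np.asarray(x[start:end + 1], dtype=float)
    ys = np.asarray(y[start:end + 1], dtype=float)
    xm = xs - xs.mean()                 # centre x so the intercept drops out
    return np.dot(xm, ys - ys.mean()) / np.dot(xm, xm)
```

Calling ols_slope(x, y, start, end) with the start and end found by the window search would give the gradient estimate at that point.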

I've tried smoothing the signal and calculating the gradient using central differences, but that still gave very noisy results. More smoothing would result in 'cutting corners' off my data...


So again, thanks for your suggestions, but I've been there already.
 

FAQ: [Python] Optimization: determining gradient with variable window size

What is optimization in Python?

In this thread, optimization means making existing Python code run faster without changing its results, here by reducing the number of loop iterations needed to find each window. More generally, optimizing code involves trading off speed, accuracy, and resource usage.

What is gradient in Python?

The gradient is the slope of a function or curve at a given point. For measured data, as in this thread, it is the local rate of change of one quantity with respect to another and has to be estimated from noisy samples.

How is gradient calculated in Python?

For sampled data, the gradient is typically estimated numerically, for example with finite differences (such as central differences) or by fitting a straight line through neighbouring points with ordinary least squares. Gradient descent is a separate optimization algorithm that uses gradients; it does not compute them from data.
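As a small illustration of estimating a gradient from sampled data, the sketch below uses numpy.gradient, which applies central differences at interior points and one-sided differences at the two ends. The arrays x and y are made-up example data, not values from the thread.

```python
import numpy as np

x = np.linspace(0.0, 4.0, 5)    # sample positions 0, 1, 2, 3, 4
y = x ** 2                      # sampled values of a known curve
# Central differences in the interior, one-sided differences at the ends.
slope = np.gradient(y, x)
```

On noisy measurements this estimate amplifies the noise, which is why the thread looks at smoothing and at fitting over wider windows.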

What is variable window size in Python optimization?

Variable window size means that the number of data points used to estimate the gradient is chosen separately at each point. In this thread the window grows until the difference between its first and last values is large compared with the noise, so flat regions get wide windows and steep regions get narrow ones.

What are some common methods for determining gradient with variable window size in Python?

The methods discussed in this thread are a least-squares fit over a window that grows until the end-point difference reaches 20 times the standard deviation of the noise, and, as a fixed-width alternative, smoothing with a Gaussian and differentiating the result, which can be done in one step by convolving with the derivative of a Gaussian. Stochastic, mini-batch, and adaptive gradient descent are optimization algorithms and are not methods for this task.
