Clustering Azimuths: Algorithm & Link

  • Thread starter billiards
In summary, the thread is about clustering a set of azimuths while accounting for the fact that 355 is close to 0. The original poster asks for an algorithm, ideally one that determines the number of clusters itself, and for Python-specific help. One reply suggests mapping each azimuth to a point on a circle in 2 dimensions, (sin x, cos x), and clustering there.
  • #1
billiards
Hi all,

Suppose I have a set of azimuths (in degrees), e.g. [ 0, 10, 11, 67, 68, 69, 70, 124, 127, 136, 355].

How can I cluster these directions, bearing in mind that 355 is close to 0?

Can someone point me to a link, preferably with an algorithm I can use?

Cheers
 
  • #2
I'm sure there are algorithms for clustering on a (1-dimensional) torus, i.e. a circle. A quick Google search pointed me to this.
If you don't find one, this hack might work: map each azimuth to a point on the unit circle in 2 dimensions, x -> (sin x, cos x), converting x from degrees to radians first if your sin and cos expect radians, and look for clusters there.
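The mapping suggested above can be sketched in a few lines of Python. This is only an illustration, assuming the azimuths are given in degrees; the helper names `azimuths_to_unit_circle` and `chord_distance` are made up for this example.

```python
import math

def azimuths_to_unit_circle(azimuths_deg):
    """Map each azimuth (degrees) to the point (sin x, cos x) on the unit circle."""
    points = []
    for a in azimuths_deg:
        r = math.radians(a)  # math.sin / math.cos expect radians
        points.append((math.sin(r), math.cos(r)))
    return points

def chord_distance(p, q):
    """Ordinary straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

azimuths = [0, 10, 11, 67, 68, 69, 70, 124, 127, 136, 355]
pts = azimuths_to_unit_circle(azimuths)

# 0 and 355 are neighbours on the circle, 0 and 136 are not.
print(round(chord_distance(pts[0], pts[-1]), 3))  # about 0.087
print(round(chord_distance(pts[0], pts[9]), 3))   # about 1.854
```

Any ordinary 2-dimensional clustering method can then be run on `pts`, since the straight-line distance between two points grows with the angle between the corresponding azimuths.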
 
  • #3
Nice idea to use sin x, cos x.

Any idea what the best clustering algorithm to use would be? Ideally I want something that can figure out the optimum number of clusters itself from the data.

(Incidentally, I am trying to do this in Python, so any Python-specific help would be particularly appreciated.)
 
  • #5


Hello,

Thank you for your question. Clustering azimuths is a common problem in many scientific fields, including geology, astronomy, and meteorology. There are several approaches that can be used to cluster these directions, depending on the specific application and data set.

One approach is to use a method from circular statistics, such as a circular variant of the k-means algorithm. Such a method measures distances around the circle and averages directions with the circular mean instead of the arithmetic mean, so it handles the fact that 355 is close to 0. It can be implemented in languages such as R or Python.
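To illustrate why the circular nature of azimuths matters, here is a minimal sketch of the two building blocks such a method relies on: a wrap-around distance and a circular mean. It assumes azimuths in degrees; the function names are chosen for this example and do not come from any particular library.

```python
import math

def circular_distance(a_deg, b_deg):
    """Smallest angle between two azimuths, in degrees (between 0 and 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def circular_mean(azimuths_deg):
    """Mean direction in degrees, taken from the summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in azimuths_deg)
    c = sum(math.cos(math.radians(a)) for a in azimuths_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

print(circular_distance(355, 0))   # 5.0
print(sum([355, 5]) / 2)           # 180.0: the arithmetic mean points the wrong way
print(circular_mean([355, 5]))     # within rounding error of 0 (mod 360)
```

Note that the circular mean is undefined when the unit vectors cancel exactly (for example, 0 and 180); this sketch does not handle that case specially.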

For background, a standard reference on circular statistics is Edward Batschelet's book "Circular Statistics in Biology" (Academic Press, 1981), which covers the circular mean and the measures of dispersion that such an algorithm builds on.

Additionally, dedicated circular-statistics packages such as Oriana and OrianaJ may be worth a look, although whether they provide a built-in clustering routine should be checked against their documentation.

I hope this helps. Best of luck with your research!
 

FAQ: Clustering Azimuths: Algorithm & Link

1. What is the purpose of clustering azimuths?

The purpose of clustering azimuths is to group together similar directional data points in order to identify patterns or relationships within the data. This can be useful for various applications such as data analysis, visualization, and prediction.

2. How does the clustering azimuths algorithm work?

A k-means-style algorithm for azimuths works by first randomly selecting a set of points from the data as initial cluster centroids. Each data point is then assigned to the cluster with the nearest centroid, with distance measured around the circle so that 355 and 0 count as 5 apart. The centroids are then updated to the circular mean of the points assigned to each cluster; the arithmetic mean is not suitable here, since for 355 and 5 it gives 180 rather than 0. This process is repeated until the centroids no longer change significantly.
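The steps described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes azimuths in degrees, stops when the assignments stop changing, and leaves an empty cluster's centroid where it was. The function name `circular_kmeans` and its `init` and `seed` parameters are assumptions made for this example.

```python
import math
import random

def circular_distance(a, b):
    """Smallest angle between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def circular_mean(angles):
    """Mean direction in degrees, from the summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0

def circular_kmeans(azimuths, k, init=None, seed=0, max_iter=100):
    """Return (labels, centroids) for a k-means-style clustering on the circle."""
    rng = random.Random(seed)
    # Initial centroids: caller-supplied, or k data points picked at random.
    centroids = list(init) if init is not None else rng.sample(list(azimuths), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by circular distance.
        new_labels = [
            min(range(k), key=lambda j: circular_distance(a, centroids[j]))
            for a in azimuths
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: circular mean of each non-empty cluster.
        for j in range(k):
            members = [a for a, lab in zip(azimuths, labels) if lab == j]
            if members:
                centroids[j] = circular_mean(members)
    return labels, centroids

azimuths = [0, 10, 11, 67, 68, 69, 70, 124, 127, 136, 355]
labels, centroids = circular_kmeans(azimuths, 3, init=[0, 70, 130])
print(labels)  # [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 0]
```

With random initialization the result can depend on the starting centroids, so in practice it is common to run several seeds and keep the best result.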

3. What factors should be considered when choosing the number of clusters?

The number of clusters should be chosen based on the specific data and application. Some factors to consider include the size and complexity of the data, the desired level of granularity, and the goals of the analysis. It may also be helpful to visually inspect the results of different cluster numbers to determine the optimal choice.
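When the clusters are well separated, one simple way to let the data suggest the number of clusters is to sort the azimuths and cut the circle at every gap wider than a threshold. The sketch below assumes azimuths in degrees; the 30 degree threshold is an arbitrary choice for illustration, and `gap_clusters` is a name made up for this example. This is a different technique from k-means (it is essentially single-linkage clustering on the circle).

```python
def gap_clusters(azimuths_deg, max_gap_deg=30.0):
    """Split azimuths into clusters wherever neighbours are more than max_gap_deg apart."""
    pts = sorted(a % 360.0 for a in azimuths_deg)
    n = len(pts)
    if n == 0:
        return []
    # gaps[i] is the clockwise angle from pts[i] to the next point, wrapping at 360.
    gaps = [(pts[(i + 1) % n] - pts[i]) % 360.0 for i in range(n)]
    breaks = [i for i, g in enumerate(gaps) if g > max_gap_deg]
    if not breaks:
        return [pts]  # no wide gap anywhere: a single cluster
    clusters = []
    for b_idx, b in enumerate(breaks):
        # A cluster runs from just after one break to the next break, around the circle.
        end = breaks[(b_idx + 1) % len(breaks)]
        i = (b + 1) % n
        cluster = [pts[i]]
        while i != end:
            i = (i + 1) % n
            cluster.append(pts[i])
        clusters.append(cluster)
    return clusters

azimuths = [0, 10, 11, 67, 68, 69, 70, 124, 127, 136, 355]
clusters = gap_clusters(azimuths)
print(len(clusters))  # 3
print(clusters)       # 355 ends up in the same cluster as 0, 10 and 11
```

The threshold still has to be chosen, so this moves the decision from "how many clusters" to "how far apart is separate", which is sometimes easier to answer from domain knowledge.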

4. Can the clustering azimuths algorithm handle large datasets?

Yes, a k-means-style algorithm scales reasonably well to large datasets: each iteration computes the distance from every point to every centroid, so its cost grows with the number of points times the number of clusters. For extremely large datasets it may still be necessary to optimize the implementation or use parallel processing techniques.

5. How can the results of clustering azimuths be evaluated?

The results of clustering azimuths can be evaluated by comparing the clusters to known or expected patterns in the data, or by using metrics such as the silhouette coefficient or the within-cluster sum of squared errors, both computed with a circular distance so that 355 and 0 count as close. It is also important to consider the specific goals of the analysis and whether the clustering accurately captures the desired relationships in the data.
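As a rough illustration of the silhouette coefficient applied to azimuths, the sketch below computes the mean silhouette using a wrap-around distance. It assumes azimuths in degrees and at least two clusters, and scores points in single-member clusters as 0, which is the usual convention. The function names are chosen for this example.

```python
def circular_distance(a, b):
    """Smallest angle between two azimuths, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def circular_silhouette(azimuths, labels):
    """Mean silhouette coefficient over all points, using circular distance."""
    cluster_ids = sorted(set(labels))
    if len(cluster_ids) < 2:
        raise ValueError("silhouette needs at least two clusters")
    n = len(azimuths)
    total = 0.0
    for i in range(n):
        # Mean distance from point i to the members of each cluster (excluding itself).
        mean_dist = {}
        for c in cluster_ids:
            ds = [circular_distance(azimuths[i], azimuths[j])
                  for j in range(n) if labels[j] == c and j != i]
            if ds:
                mean_dist[c] = sum(ds) / len(ds)
        if labels[i] not in mean_dist:
            continue  # single-member cluster: contributes 0
        a = mean_dist[labels[i]]
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n

azimuths = [0, 10, 11, 67, 68, 69, 70, 124, 127, 136, 355]
good = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 0]  # 355 grouped with 0, 10, 11
bad = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]   # 355 grouped with 124, 127, 136
print(circular_silhouette(azimuths, good) > circular_silhouette(azimuths, bad))  # True
```

Values close to 1 indicate compact, well-separated clusters, so comparing this score across candidate clusterings is one way to choose between them.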
