How to analyse a sequence of vehicle states?

In summary, analyzing a sequence of vehicle states involves systematically examining the various conditions and behaviors of a vehicle over time. This process typically includes data collection from sensors, understanding state transitions, and applying algorithms to track and predict vehicle dynamics. Key steps include defining the states, employing time-series analysis, and utilizing visualization tools to interpret the results. The analysis helps in improving vehicle performance, enhancing safety, and optimizing operations.
  • #1
serbring
Hi all,

I have to analyse a dataset containing real-world vehicle trajectories and in particular:
1. The trajectories were classified into states as a function of certain vehicle parameters and location (urban roads, country roads, etc.). Each state is identified by an integer (1, 2, 3, etc.), which also gives me a categorical signal called "state".
2. Portions of the vehicle trajectories were grouped whenever a certain sequence of states occurred, which corresponds to a trip starting from a parking lot, then travelling along an urban road, then along a highway stretch, etc. Thus, a sequence might start with a run of 1s, followed by a run of 100 2s, followed by a run of 3s, and so on. This task was carried out by converting the sequence of numbers into a string and by setting a proper regular expression. It is a quick and dirty approach and it took a lot of time to tune the parameters of the regular expression, and it is far from perfect.

This is because the vehicle states may be misclassified (step 1) and because, in real-world conditions, operations are not always carried out in the same way: there may be extra states in between, for example because the driver took a wrong road, and the duration of such an extra state varies. So I need to find a better method, but I have no idea which approach to adopt. Do you have any suggestion to give me?
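For illustration, the string-plus-regular-expression approach described in this post can be sketched roughly as follows. The state codes, the sample data and the minimum run lengths in the pattern are all assumptions made for the example, not values from the actual dataset.

```python
import re

# Hypothetical state codes: 1 = parking lot, 2 = urban road, 3 = highway.
states = [1, 1, 1, 2, 2, 2, 2, 3, 3, 1, 1]

# One character per sample; this only works while every code is a single digit.
encoded = "".join(str(s) for s in states)

# Assumed tuning: at least two 1s, then at least three 2s, then at least one 3.
TRIP = re.compile(r"1{2,}2{3,}3+")

def find_trips(encoded):
    """Return (start, end) sample indices of every non-overlapping match."""
    return [m.span() for m in TRIP.finditer(encoded)]

trips = find_trips(encoded)
```

A single misclassified sample inside one of the runs is enough to make such a pattern miss the trip, which is the fragility described above.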

Thanks!
 
  • #2
It might help to tell us what the purpose is. It's hard to suggest an approach when the desired result is unknown.
 
  • #3
As part of data analysis, it's good to review the data and correct fields or discard questionable data items.

Each dataset has its own rules for validity and you may need to establish them. As an example, you might discard rows that have missing data or if possible fill in the missing info with nominal values.

Say you had a dataset for trains, planes and automobiles, you could validate any speed fields by applying some speed range criteria to identify rows where speeds are too high or too low for the type of vehicle being recorded.
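A minimal sketch of that kind of range check, assuming each row is a dictionary with hypothetical `vehicle` and `speed_kmh` fields. The speed limits are placeholder values chosen only for illustration; a real dataset would need its own validity rules.

```python
# Hypothetical plausible speed ranges in km/h per vehicle type (placeholders).
SPEED_LIMITS_KMH = {
    "train": (0.0, 350.0),
    "plane": (0.0, 1000.0),
    "automobile": (0.0, 250.0),
}

def is_valid(row):
    """Keep a row only if its speed is present and within range for its type."""
    speed = row.get("speed_kmh")
    limits = SPEED_LIMITS_KMH.get(row.get("vehicle"))
    if speed is None or limits is None:
        return False            # missing speed or unknown vehicle type
    low, high = limits
    return low <= speed <= high

rows = [
    {"vehicle": "automobile", "speed_kmh": 92.0},
    {"vehicle": "automobile", "speed_kmh": 640.0},  # implausible for a car
    {"vehicle": "train", "speed_kmh": None},        # missing value
]
clean = [r for r in rows if is_valid(r)]
```

Here invalid rows are simply dropped; filling in nominal values instead would be the other option mentioned above.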
 
  • #4
serbring said:
It is a quick and dirty approach and it took a lot of time
This statement seems to contradict itself

serbring said:
Do you have any suggestion to give me?
Don't use a regular expression.

Implement a parser in an object-oriented language using the state pattern.
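As a rough idea of what that could look like, here is a small sketch of the state pattern in Python. The class names, the state codes (1 = parked, 2 = urban road, 3 = highway) and the trip definition (parked, urban, highway, parked) are assumptions for the example and do not come from the original dataset.

```python
class State:
    """Base class: each state decides which state follows a given code."""
    def on_code(self, code):
        raise NotImplementedError

class Start(State):
    def on_code(self, code):
        return Parked() if code == 1 else self      # wait for a parked sample

class Parked(State):
    def on_code(self, code):
        if code == 2:
            return Urban()
        return self if code == 1 else Start()       # unexpected code: reset

class Urban(State):
    def on_code(self, code):
        if code == 3:
            return Highway()
        if code == 1:
            return Parked()                         # parked again: restart trip
        return self if code == 2 else Start()

class Highway(State):
    def on_code(self, code):
        if code == 1:
            return Done()                           # trip completed
        return self if code == 3 else Start()

class Done(State):
    def on_code(self, code):
        return self

def contains_trip(codes):
    """True if codes contain parked -> urban -> highway -> parked in order."""
    state = Start()
    for code in codes:
        state = state.on_code(code)
        if isinstance(state, Done):
            return True
    return False
```

Each class holds the transition rules for one state, so tolerating an extra state (a detour, say) means changing one `on_code` method rather than retuning a whole expression.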
 
  • #5
serbring said:
This task was carried out by converting the sequence of numbers into a string and by setting a proper regular expression.
This seems odd. It would seem both easier and faster just to check the sequences of numbers directly.
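One possible way to do that, sketched under the assumption that a trip can be described as a list of `(state, min_length)` pairs (a hypothetical format), is to run-length encode the integers and compare consecutive runs against the pattern:

```python
from itertools import groupby

def run_lengths(states):
    """Collapse a sample-wise state signal into (state, count) runs."""
    return [(s, sum(1 for _ in g)) for s, g in groupby(states)]

def find_trips(states, pattern):
    """Return (start, end) sample spans where consecutive runs match pattern.

    pattern is a list of (state, min_length) pairs (assumed format).
    """
    runs = run_lengths(states)
    starts = [0]
    for _, n in runs:
        starts.append(starts[-1] + n)       # starts[i] = first sample of run i
    spans = []
    i = 0
    while i + len(pattern) <= len(runs):
        window = runs[i:i + len(pattern)]
        if all(s == ps and n >= pn
               for (s, n), (ps, pn) in zip(window, pattern)):
            spans.append((starts[i], starts[i + len(pattern)]))
            i += len(pattern)               # skip past the matched runs
        else:
            i += 1
    return spans

spans = find_trips([1, 1, 1, 2, 2, 2, 2, 3, 3, 1, 1], [(1, 2), (2, 3), (3, 1)])
```

Unlike the one-character-per-sample string encoding, this also works when state codes have more than one digit.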
 
  • #6
I routinely used regular expressions to clean up, simplify, or filter inputs from imperfect sources before passing the inputs on to other algorithms. I considered it to be much easier and more reliable than the alternatives.
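A small sketch of that kind of clean-up step, assuming the states are encoded as one digit per sample and that an isolated one-sample state between two identical neighbours can be treated as a misclassification (both are assumptions for the example):

```python
import re

# A single sample that differs from two identical neighbours, e.g. the 2 in "121".
GLITCH = re.compile(r"(\d)(?!\1)\d(?=\1)")

def remove_glitches(encoded):
    """Overwrite isolated one-sample states with the surrounding state."""
    return GLITCH.sub(r"\1\1", encoded)

cleaned = remove_glitches("1112111222322333")
```

The cleaned string can then be passed to whatever does the actual trip detection. Longer detours are not handled by this rule; it only removes single-sample glitches.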
 
  • #7
Thanks for your answers. I will reply to all your comments.
FactChecker said:
It might help to tell us what the purpose is. It's hard to suggest an approach when the desired result is unknown.

jedishrfu said:
As part of data analysis, it's good to review the data and correct fields or discard questionable data items.

Each dataset has its own rules for validity and you may need to establish them. As an example, you might discard rows that have missing data or if possible fill in the missing info with nominal values.

Say you had a dataset for trains, planes and automobiles, you could validate any speed fields by applying some speed range criteria to identify rows where speeds are too high or too low for the type of vehicle being recorded.
The data were checked for validity (e.g., for samples with no logged vehicle position because the GPS did not get a fix), but of course misclassification may still occur, and this can never be fully solved when dealing with a large dataset of real-world data. More specifically, I am doing trip analysis, where a trip is a sequence of vehicle states: the vehicle starts from a point (e.g., the owner's house, although a different place may occur), then travels on an extra-urban road, then a highway, then an urban road, and finally back along the same path. However, the driver may choose a different road to travel back home, or decide on a detour, and so on. Such changes lead to different sequences of vehicle states, which complicates the detection of trips. Once trips are detected, I will calculate features of the trips and analyse them. Hopefully, it is clearer now.


pbuk said:
This statement seems to contradict itself


Don't use a regular expression.

Implement a parser in an object-oriented language using the state pattern.
As reported by @FactChecker, for a quick and flexible pattern, a regular expression is a very quick approach even if it is not the best in terms of efficiency. However, with a large collection of real-world data things get more complicated, so I am now looking for a more advanced approach. I did not know the state pattern, and it might be interesting. I will dig into it but, as far as I understand, it is mostly a convenient way of organising multiple if-statements, so I am not sure it would help in this case.

PeterDonis said:
This seems odd. It would seem both easier and faster just to check the sequences of numbers directly.
If you have a specific method to suggest, that would be great. Thanks.
 
  • #8
serbring said:
More specifically, I am doing trip analysis, where a trip is a sequence of vehicle states: the vehicle starts from a point (e.g., the owner's house, although a different place may occur), then travels on an extra-urban road, then a highway, then an urban road, and finally back along the same path. However, the driver may choose a different road to travel back home, or decide on a detour, and so on. Such changes lead to different sequences of vehicle states, which complicates the detection of trips. Once trips are detected, I will calculate features of the trips and analyse them. Hopefully, it is clearer now.
Are you saying that you are trying to detect round trips by the pattern of speeds, without having position data?
serbring said:
As reported by @FactChecker, for a quick and flexible pattern, a regular expression is a very quick approach even if it is not the best in terms of efficiency.
If the application fits, I think it would be hard to beat the efficiency of the built-in regular expressions. You don't say what language you are using. Python is very popular now, but it can be astonishingly slow. I used Perl a lot for such tasks and was not bothered by any lack of speed. Your job and amount of data may just require long runs. If the execution time is very long (several hours or days), you should look for ways to periodically save things so that you can monitor progress and restart the program where it left off. Things like power "glitches", unplanned system resets, unexpected data inputs that are being handled wrong, etc. can force you to restart the program from the beginning or at some intermediate stage.
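A minimal sketch of such a save-and-resume scheme, using only the Python standard library. The function name, the JSON checkpoint format and the `every` interval are assumptions for the example; `handle` stands for whatever per-item processing is being done and is expected to store its own results.

```python
import json
import os

def process_with_checkpoint(items, handle, path, every=1000):
    """Apply handle to each item, recording progress so a rerun can resume.

    Returns the number of items processed in this run.
    """
    start = 0
    if os.path.exists(path):
        with open(path) as f:
            start = json.load(f)["next_index"]   # resume where the last run stopped
    for i in range(start, len(items)):
        handle(items[i])
        if (i + 1) % every == 0 or i + 1 == len(items):
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"next_index": i + 1}, f)
            os.replace(tmp, path)                # swap in the complete file
    return len(items) - start
```

After a crash, items processed since the last checkpoint are handled again on restart, so `handle` should be safe to repeat. The checkpoint file also doubles as a simple progress indicator while a long run is in flight.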
 

FAQ: How to analyse a sequence of vehicle states?

What is a vehicle state sequence?

A vehicle state sequence refers to a chronological series of data points that represent the various states of a vehicle over time. These states can include information such as position, speed, acceleration, fuel level, engine status, and other operational parameters that describe the vehicle's behavior and performance during a specific period.

What tools can be used to analyze vehicle state sequences?

Several tools can be used for analyzing vehicle state sequences, including programming languages like Python and R, which offer libraries for data manipulation and analysis. Additionally, software platforms such as MATLAB, Tableau, and specialized vehicle telematics systems can provide visualization and in-depth analysis capabilities. Machine learning frameworks like TensorFlow and PyTorch can also be employed for predictive analytics based on state sequences.

What are the key metrics to consider when analyzing vehicle states?

Key metrics to consider when analyzing vehicle states include average speed, fuel efficiency, acceleration patterns, braking frequency, idle time, and route optimization. Additionally, analyzing the frequency and duration of specific states, such as stops or high-speed segments, can provide insights into driving behavior and vehicle performance.

How can I visualize vehicle state sequences effectively?

Effective visualization of vehicle state sequences can be achieved using time-series plots, scatter plots, and heatmaps to represent various metrics over time. Tools like Matplotlib and Seaborn in Python can help create these visualizations. Additionally, interactive dashboards using frameworks like Dash or Tableau can allow users to explore the data dynamically, highlighting trends and anomalies in the vehicle states.

What are common challenges in analyzing vehicle state sequences?

Common challenges in analyzing vehicle state sequences include dealing with missing or noisy data, ensuring data consistency across different sources, and managing large volumes of data generated by modern vehicles. Additionally, accurately interpreting the context of the data, such as distinguishing between normal driving behavior and anomalies, can be complex and often requires domain expertise.
