Waterlix Inc.
Cambridge, Ontario, Canada
Published

Time Series Data Analysis

1) Project description: Water utilities track water usage at pumping stations using SCADA systems. These systems generate large volumes of time series data that carry key information about operations. Flow, pressure, and chemical properties of the water are among the quantities that may be recorded as time series. The goal of the project is to find anomalies in pressure and flow values during the day and relate each anomaly to a specific cause. This analysis would help utilities manage water systems better and provide early warning of leakage in their distribution systems. The time required depends on the students' skill level and their familiarity with the various ways of identifying anomalies. Students should go beyond simple anomaly detection methods and will need to test a few different approaches to find an acceptable solution. This is a semi-supervised learning problem, i.e., we have clues that a given period contains an anomaly, but we do not know exactly when it occurs, and we want to find the moment it happens.

2) Data Description: Provide information on the data you will share, including variables, type, granularity, time-period, number of records, and approximate total size in bytes. Provide information on how you will share this data, including mode of access, platforms, frequency, and format. Confirm that you will be able to provide access to this data by the required date.

The data is in CSV format and covers around 3 years at 15-minute intervals. Flow and pressure are available. The data is anonymized; however, we require an agreement to be signed stating that the data will be used only for this project and will not be shared with others or used outside of this project. These restrictions come from the data providers, who will remain anonymous.
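To make the task concrete, here is a minimal baseline sketch for flagging unusual readings in 15-minute flow or pressure data. It is not a method prescribed by Waterlix, and it is exactly the kind of simple approach students are expected to improve on. It is written in Python with comments on each function; the same logic ports directly to R. It compares each reading with the median of all readings taken at the same time of day and scores the deviation using the median absolute deviation (MAD). The function names, the threshold of 3.5, and the synthetic series are assumptions made for illustration; the real CSV layout is not specified here.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def robust_z_scores(timestamps, values):
    """Score each reading against the median/MAD of its time-of-day slot."""
    # Group readings by (hour, minute) so 03:00 is only compared with 03:00.
    slots = defaultdict(list)
    for ts, v in zip(timestamps, values):
        slots[(ts.hour, ts.minute)].append(v)

    # Median and MAD are robust: a few anomalies barely move them.
    stats = {}
    for slot, vs in slots.items():
        med = median(vs)
        mad = median([abs(v - med) for v in vs])
        stats[slot] = (med, mad)

    scores = []
    for ts, v in zip(timestamps, values):
        med, mad = stats[(ts.hour, ts.minute)]
        # 1.4826 rescales MAD to match the standard deviation for normal data.
        # A zero MAD (constant slot) falls back to a tiny scale, so any
        # deviation from the median in that slot gets a very large score.
        scale = 1.4826 * mad if mad > 0 else 1e-9
        scores.append((v - med) / scale)
    return scores


def flag_anomalies(timestamps, values, threshold=3.5):
    """Return timestamps whose robust z-score exceeds the (assumed) threshold."""
    scores = robust_z_scores(timestamps, values)
    return [ts for ts, z in zip(timestamps, scores) if abs(z) > threshold]


# Synthetic stand-in for the real data: 30 days of 15-minute flow readings
# with a daily cycle and small deterministic jitter (illustrative values only).
start = datetime(2020, 1, 1)
timestamps = [start + timedelta(minutes=15 * i) for i in range(30 * 96)]
values = []
for i in range(len(timestamps)):
    day, slot = divmod(i, 96)
    daily_cycle = 100 + 20 * math.sin(2 * math.pi * slot / 96)
    jitter = ((day * 7 + slot * 3) % 5 - 2) * 0.5
    values.append(daily_cycle + jitter)

# Inject one artificial spike on day 20 at 03:00.
spike_index = 20 * 96 + 12
values[spike_index] += 40

flagged = flag_anomalies(timestamps, values)
print(flagged)
```

A per-slot baseline like this ignores weekday/weekend patterns, seasonal drift, and the relationship between flow and pressure, which is where more capable approaches would come in.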
3) Technology Description: Provide information on any specific platforms and tools that you might require the students to use for data analysis, and if so, confirm that you will be able to provide access to these by the start date of the project. If there aren't any, say 'none'.

R is the language we expect students to use. If you prefer Python or another language, we expect enough documentation of the code to make sense of the algorithm you are using, i.e., a description of each module and the logic of the key modules in the anomaly detection process.

4) Preferred methods: Provide information on any specific methods or techniques you might require the students to use for data analysis; if there aren't any, say 'none'.

The choice of technique depends on your creativity and your knowledge. You are completely free to use any approach as long as it finds the anomalies with reasonable accuracy.
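As one illustration of an approach suited to the semi-supervised setting in the project description (a period is known to contain an anomaly, but the moment is unknown), the sketch below estimates where the mean level of a series shifts within a flagged window. It uses a least-squares single change-point search, chosen here only as an example and not as a required method. The function name, the `min_segment` parameter, and the synthetic pressure values are hypothetical.

```python
def locate_mean_shift(values, min_segment=4):
    """Return the index k that best splits values into two constant-mean parts.

    The split maximizes the between-segment sum of squares
    k * (n - k) / n * (mean_left - mean_right) ** 2, which is equivalent to
    minimizing the squared error of a two-level step fit.
    """
    n = len(values)
    if n < 2 * min_segment:
        raise ValueError("window too short for the requested min_segment")

    # Prefix sums let us compute each segment mean in constant time.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    total = prefix[-1]

    best_k, best_gain = None, -1.0
    for k in range(min_segment, n - min_segment + 1):
        left_mean = prefix[k] / k
        right_mean = (total - prefix[k]) / (n - k)
        gain = k * (n - k) / n * (left_mean - right_mean) ** 2
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k


# Synthetic flagged window: pressure holds near 60 for 40 readings (10 hours),
# then drops to about 52, as it might after a leak (illustrative values only).
window = [60.0] * 40 + [52.0] * 24
window = [v + ((i * 3) % 5 - 2) * 0.2 for i, v in enumerate(window)]

onset = locate_mean_shift(window)
print(onset)
```

With 15-minute data, the returned index converts to a timestamp by adding `onset * 15` minutes to the start of the flagged window. A single mean shift is a strong assumption; gradual leaks or multiple events would need a different model.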

Admin Mehrdad Varedi