Wednesday, June 4, 2025

Link to CMG’25 Hackathon guidelines

CMG’25 Hackathon guidelines 

Below is one example from the document of using the Perfomalist.com API:
Results (hacks) can be sent to trutechdev@gmail.com

CMG’25 Hackathon guidelines 

ANNOUNCEMENT (to publish before conference)

The task is to find change points and/or anomalies in the given time-stamped data in order to reveal different phases/patterns.


Participants can use any tools or libraries/packages (in R, Python and so on) to detect change points and/or anomalies in the data.
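As a starting point for the "any tools" route, here is a minimal Python sketch of a generic single change point search by least squares. It is only an illustration of the task, not the MASF/SETDS method used by Perfomalist, and the sample numbers are made up.

```python
from statistics import fmean


def best_single_change_point(values):
    """Return the split index k that best divides `values` into two
    constant-level segments, values[:k] and values[k:].

    "Best" means the smallest total squared deviation from each
    segment's own mean (a generic least-squares criterion).
    """
    if len(values) < 2:
        raise ValueError("need at least two points")
    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):
        left, right = values[:k], values[k:]
        mean_left, mean_right = fmean(left), fmean(right)
        cost = sum((v - mean_left) ** 2 for v in left)
        cost += sum((v - mean_right) ** 2 for v in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Made-up series: the level shifts from about 10 to about 20 at index 5.
series = [10.1, 9.8, 10.0, 10.2, 9.9, 20.1, 19.8, 20.0, 20.3]
print(best_single_change_point(series))
```

Real data usually has several change points and seasonality, so a dedicated library or the API described below is likely a better fit for the actual hackathon files.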


Alternatively, they can use the free Change Point Detection API (e.g. via Postman.com) described in the post:

Change Point Detection is implemented in the free web tool Perfomalist

In particular, instructions on how to use it are given HERE.


To visualize the result, any spreadsheet charting tool or other means (e.g. Python, R and so on) can be used. Examples are in the following picture:


Time to work on the task: 3 hours.

Results will be judged by CMG experts, and the winner will get an award and time to make a short presentation.


Vendors of similar tools are welcome to participate. 


If participants would like to use MATLAB anomaly detection tools, a licensed version is available here:


URL for event page: https://www.mathworks.com/licensecenter/classroom/4866200/

1. This link will take you to a MathWorks account sign-in page.

2. Create an account or use an existing account.

3. Press the "Access MATLAB Online" button, followed by "Open MATLAB Online".



On-site activities


Data to test: https://github.com/numenta/NAB/tree/master/data

In particular, the following CSV files:

LINK TO THE FOLDER WITH DATA IS HERE


EXAMPLE 1

Simple case: art_daily_flatmiddle.csv


The tool is the perfomalist.com Change Point Detection API described HERE


The 1st step is to change the format of the data (using Excel or Google Sheets):

From original:


timestamp,value

2014-04-01 00:00:00,-21.0483826823

2014-04-01 00:05:00,-20.2954768676

2014-04-01 00:10:00,-18.127229468299998

2014-04-01 00:15:00,-20.1716653997

To the Perfomalist format:


date,time,value

2014-04-01,0:00:00,-21.04838268

2014-04-01,0:05:00,-20.29547687

2014-04-01,0:10:00,-18.12722947

2014-04-01,0:15:00,-20.1716654
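The same reformatting can also be scripted instead of done in a spreadsheet. The sketch below reproduces the sample rows above; the function name is made up for illustration, and two details are assumptions copied from the spreadsheet output rather than documented API requirements: the hour is written without a leading zero, and values are rounded to 10 significant digits.

```python
import csv
import io
from datetime import datetime


def to_perfomalist_rows(nab_csv_text):
    """Convert NAB 'timestamp,value' CSV text to 'date,time,value' lines."""
    reader = csv.DictReader(io.StringIO(nab_csv_text))
    rows = ["date,time,value"]
    for rec in reader:
        ts = datetime.strptime(rec["timestamp"], "%Y-%m-%d %H:%M:%S")
        # Assumption: hour without leading zero, as in the spreadsheet output.
        time_part = f"{ts.hour}:{ts.minute:02d}:{ts.second:02d}"
        # Assumption: 10 significant digits, matching the sample rows.
        value = format(float(rec["value"]), ".10g")
        rows.append(f"{ts.date().isoformat()},{time_part},{value}")
    return rows


sample = """timestamp,value
2014-04-01 00:00:00,-21.0483826823
2014-04-01 00:05:00,-20.2954768676
2014-04-01 00:10:00,-18.127229468299998
2014-04-01 00:15:00,-20.1716653997
"""

for line in to_perfomalist_rows(sample):
    print(line)
```

For a whole file, read its text with `open(path).read()` and write the returned lines to a new CSV file.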


Then call the Perfomalist API (e.g. using postman.com).


The result is one change point on 2014-04-11, which can be easily validated by building the spreadsheet chart (see below):

Chart


EXAMPLE 2

More difficult case: https://github.com/numenta/NAB/blob/master/data/realTraffic/TravelTime_451.csv


After repeating the above steps (see Example 1), the result should show several change points:


To reduce the number of change points, one can explicitly provide the following tuning parameters as the first 3 lines of the data:

  • sValue - Statistical band in %, where 100 is UCL=MAX and 0 is UCL=LCL=mean. (normality)

  • eValue - Exception Value (EV) threshold in % of actual historical average. (insensitivity)

  • BaseLineLength - The time period to compare current value against.
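If the data file is prepared by a script, the three tuning lines can be prepended programmatically. This is a small sketch with a made-up helper name; it assumes the parameters go before the `date,time,value` header and are written as `name,value` without spaces, since the exact spacing the API accepts is not specified here.

```python
def with_tuning_parameters(data_lines, s_value, e_value, baseline_length):
    """Return a new list of lines with the three tuning lines placed first."""
    tuning = [
        f"sValue,{s_value}",                  # statistical band, in %
        f"eValue,{e_value}",                  # exception value threshold, in %
        f"BaseLineLength,{baseline_length}",  # baseline period length
    ]
    return tuning + list(data_lines)


data = [
    "date,time,value",
    "2014-04-01,0:00:00,-21.04838268",
]
# The numbers 99, 20 and 7 are the ones used in this example.
for line in with_tuning_parameters(data, 99, 20, 7):
    print(line)
```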

After adding the following lines:

sValue, 99

eValue, 20

BaseLineLength, 7

the API returns only 3 change points:

Putting the API output into Excel or Google Sheets, one can visualize the result by showing phases in the data between change points (see below):
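For those who prefer Python to a spreadsheet, the phases can be built by cutting the series at the change points. The sketch below assumes the change points are already available as a list of timestamp strings in the same format as the data (the API output format is not shown here), and the sample values are made up.

```python
from bisect import bisect_right
from statistics import fmean


def split_into_phases(points, change_points):
    """Group (timestamp, value) pairs into phases separated by change points.

    A point whose timestamp equals a change point starts the new phase.
    Timestamps are compared as strings, so they must share one sortable
    format such as 'YYYY-MM-DD' or 'YYYY-MM-DD HH:MM:SS'.
    """
    boundaries = sorted(change_points)
    phases = [[] for _ in range(len(boundaries) + 1)]
    for timestamp, value in points:
        phases[bisect_right(boundaries, timestamp)].append((timestamp, value))
    return phases


def phase_means(phases):
    """Mean level of each phase, or None for an empty phase."""
    return [fmean([v for _, v in phase]) if phase else None for phase in phases]


# Made-up daily values with one change point.
points = [
    ("2014-04-09", 10.0),
    ("2014-04-10", 12.0),
    ("2014-04-11", 30.0),
    ("2014-04-12", 32.0),
]
phases = split_into_phases(points, ["2014-04-11"])
print(phase_means(phases))
```

Each phase can then be drawn as its own series or as a horizontal line at its mean level with any plotting tool.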




Saturday, April 26, 2025

Anomaly/change detection hackathon proposal

ANNOUNCEMENT 

Objective: The goal of this hackathon is to identify change points and/or anomalies within a provided time-stamped dataset. This analysis will reveal distinct phases and patterns within the data.


Tools and Methods:

  • Participants are encouraged to utilize any tools, libraries, or packages (e.g., R, Python) for change point and anomaly detection.

  • Alternatively, participants can use the free Change Point Detection API via Postman (postman.com), as detailed below.

Resources:


Hackathon Details:

  • Time Limit: 3 hours to complete the task.

  • Judging: CMG experts will evaluate the results.

  • Award: The winner will receive an award and the opportunity for a short presentation.

  • Vendor Participation: Vendors of similar tools are welcome to participate.


Visualization:

  • Results can be visualized using spreadsheet charting tools (Excel, Google Sheets) or other methods (e.g., Python, R).

  • Examples of visualizations will be provided below:

Friday, November 22, 2024

"Detecting Past and Future Change Points in Performance Data" - research paper preprint

Our research paper was accepted for ORAL PRESENTATION at ICTDsC 2024 in India.

We were not able to attend and plan to publish it later elsewhere. The abstract is below.

The paper itself can be found as a preprint on our Google Drive HERE


Wednesday, January 11, 2023

The Perfomalist team is presenting at the www.CMGimpact.com international conference in Orlando.


www.CMGimpact.com

LinkedIn Post

ABSTRACT: The MASF/SETDS method of detecting changes and anomalies in performance data, its recent implementation, and the way to use and interpret the results will be presented with real examples against MongoDB testing data.

 

Since 1995, when the CMG conference published a very influential paper about the MASF method of anomaly detection, this topic has become increasingly popular in the area of Capacity and Performance management. A modification of MASF, the SETDS method, was introduced in the 2002 CMG best paper and was then implemented in several companies and applications, including the CMG online class “Perfomalies (Performance Anomaly) Detection”. Most recently, this method has been turned into a cloud-based serverless API microservice, which is available for free via the “Perfomalist” web app. The Perfomalist change point detection API was used against MongoDB’s testing data and produced acceptable results, which were published at the SPEC.org 2022 conference. That paper included an additional post-processing algorithm (XGBoost) to eliminate false positives, which is planned to be added to the Perfomalist service.


Friday, March 4, 2022

Perfomalist #ChangeDetection API was used against #MongoDB #perfomanceTesting dataset

We are participating in the data challenge for icpe2022.spec.org conference.

The challenge dataset is provided by MongoDB.

Initially, a small part of the data was used to prove that the Perfomalist CPD API can be used.

The data looks like a big data cube with numerous dimensional variables and two factual ones (datetime and value). I took one case with a particular slice of this cube and processed it (datetime-value) by calling the Perfomalist API. The result, which I plotted using Excel, can be seen in the following picture.




IDEA: Potentially, a program could be developed to call the CPD API (i.e., Perfomalist) for every data cube slice and to collect the change points in a separate table, like in the 2nd picture below:


That (meta-) data should then be correlated with events happening (or not happening) around any change dates detected, e.g., a feature flag turned on/off (that data is hidden from us so far). The result should help to explain each change. Additionally, to measure the magnitude of the change, I would suggest calculating the entropy-based imbalance of the data between changes (see my last paper for how to do that). For example, that could tell how stable or unstable performance became after a particular change.

After my initial Perfomalist usage, a more rigorous analysis was done against the MongoDB dataset, based on which the following paper was written and accepted for the data challenge track of the conference:
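The per-slice collection idea could be sketched in Python as follows. Everything specific here is hypothetical: the dimension names (`test`, `variant`), the record layout, and the detector, which is a naive stand-in for a real call to the CPD API and is passed in as a function so it can be swapped out.

```python
from collections import defaultdict


def collect_change_points(records, dimension_keys, detect):
    """Run `detect` on every slice of the cube and gather one table.

    records: dicts holding the dimension values plus 'datetime' and 'value'.
    detect: function taking a sorted list of (datetime, value) pairs and
            returning the datetimes of detected change points.
    """
    slices = defaultdict(list)
    for rec in records:
        key = tuple(rec[k] for k in dimension_keys)
        slices[key].append((rec["datetime"], rec["value"]))
    table = []
    for key, series in sorted(slices.items()):
        series.sort()
        for change_time in detect(series):
            row = dict(zip(dimension_keys, key))
            row["change_point"] = change_time
            table.append(row)
    return table


def jump_detector(series, threshold=5.0):
    """Stand-in detector: flag a point that jumps more than `threshold`
    from the previous value. A real run would call the CPD API here."""
    return [t for (_, prev), (t, cur) in zip(series, series[1:])
            if abs(cur - prev) > threshold]


# Made-up records for two slices of a hypothetical cube.
records = [
    {"test": "insert", "variant": "A", "datetime": "2021-01-01", "value": 10.0},
    {"test": "insert", "variant": "A", "datetime": "2021-01-02", "value": 10.5},
    {"test": "insert", "variant": "A", "datetime": "2021-01-03", "value": 20.0},
    {"test": "query", "variant": "A", "datetime": "2021-01-01", "value": 5.0},
    {"test": "query", "variant": "A", "datetime": "2021-01-02", "value": 5.2},
]
table = collect_change_points(records, ["test", "variant"], jump_detector)
print(table)
```

The resulting table has one row per detected change, keyed by the slice dimensions, which is the shape needed for correlating changes with external events.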

"Change Point Detection for MongoDB Time Series Performance Regression" paper for ACM/SPEC ICPE 2022 Data Challenge Track


Monday, January 10, 2022

Perfomalist Release Notes

- Perfomalist 1.1 now has the Change Point Detection API as described in the previous post:


The Change Points Detection Perfomalist API beta version is released. 


Contributors:  Arvid Trubin

Filipp Trubin

 
 

- Perfomalist 1.2 has two additional columns in the table view of the weekly profile to highlight the two types of anomalies the tool detects:


    High Anomaly - Unusually high data value for a particular hour, calculated as Actual - UCL95 (only positive results of the subtraction are populated; they represent EV+, the Exception Value/significance of the anomaly)

    Low Anomaly - Unusually low data value for a particular hour, calculated as UCL5 - Actual (only positive results of the subtraction are populated; they represent EV-, the Exception Value/significance of the anomaly)

If the value of the Low and/or High Anomaly is "0", the particular hour does not have that type of anomaly.
The number of anomalous week hours is also counted and printed in the header of each column in "()".
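The arithmetic behind the two columns can be illustrated with a short Python sketch. The control limits are taken as given inputs, since how the tool computes them is not described in these notes; the function names and sample numbers are made up.

```python
def exception_values(actual, upper_limit, lower_limit):
    """Return (high_anomaly, low_anomaly) for one hour.

    high_anomaly = Actual - upper limit, kept only when positive (EV+).
    low_anomaly  = lower limit - Actual, kept only when positive (EV-).
    A value of 0 means no anomaly of that type.
    """
    high = max(actual - upper_limit, 0.0)
    low = max(lower_limit - actual, 0.0)
    return high, low


def count_anomalous_hours(hours):
    """Count hours with a high anomaly and hours with a low anomaly.

    hours: iterable of (actual, upper_limit, lower_limit) tuples.
    """
    results = [exception_values(*h) for h in hours]
    high_count = sum(1 for high, _ in results if high > 0)
    low_count = sum(1 for _, low in results if low > 0)
    return high_count, low_count


# Made-up hours: one above the band, one below it, one inside it.
hours = [(120.0, 100.0, 40.0), (30.0, 100.0, 40.0), (70.0, 100.0, 40.0)]
print([exception_values(*h) for h in hours])
print(count_anomalous_hours(hours))
```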

Contributor: Michael Berdichevsky


