A post from Amazon AWS : Transitioning from Amazon Rekognition people pathing: Exploring other alternatives

Amazon Rekognition people pathing is a machine learning (ML)–based capability of Amazon Rekognition Video that users can use to understand where, when, and how each person is moving in a video. This capability can be used for multiple use cases, such as for understanding:

Retail analytics – Customer flow in the store and identifying high-traffic areas
Sports analytics – Players’ movements across the field or court
Industrial safety – Workers’ movement in work environments to promote compliance with safety protocols

After careful consideration, we made the decision to discontinue Rekognition people pathing on October 31, 2025. New customers will not be able to access the capability effective October 24, 2024, but existing customers will be able to use the capability as normal until October 31, 2025.

This post discusses an alternative solution to Rekognition people pathing and how you can implement this solution in your applications.

Alternatives to Rekognition people pathing

One alternative to Amazon Rekognition people pathing combines the open source ML model YOLOv9, which is used for object detection, and the open source ByteTrack algorithm, which is used for multi-object tracking.

Overview of YOLOv9 and ByteTrack

YOLOv9 is a recent model in the YOLO object detection series. It uses a specialized architecture called Generalized Efficient Layer Aggregation Network (GELAN) to analyze images efficiently. The model divides an image into a grid, quickly identifying and locating objects in each section in a single pass. During training, it uses a technique called programmable gradient information (PGI) to improve accuracy, especially for easily missed objects. This combination of speed and accuracy makes YOLOv9 well suited to applications that need fast and reliable object detection.

ByteTrack is an algorithm for tracking multiple moving objects in videos, such as people walking through a store. What sets it apart is that it uses both high-confidence detections and low-confidence detections, which often correspond to objects that are difficult to detect, instead of discarding the low-confidence ones. Even when someone is partially hidden or in a crowd, ByteTrack can often still follow them. It's designed to be fast and accurate, working well even when there are many people to track simultaneously.
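The two-stage idea described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the ByteTrack implementation: greedy IoU matching stands in for the Kalman-filter prediction and Hungarian assignment that the real algorithm uses, creation of new tracks is omitted, and the function names and thresholds (iou, greedy_match, associate, high_thresh, min_iou) are hypothetical.

```python
# Simplified illustration of ByteTrack-style two-stage association.
# Assumption: greedy IoU matching replaces the Kalman-filter prediction and
# Hungarian assignment of the real algorithm; new-track creation is omitted.

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, min_iou):
    """Pair each track with its best-overlapping unused detection."""
    matches, used = {}, set()
    for track_id, track_box in tracks.items():
        best_idx, best_iou = None, min_iou
        for idx, (box, _score) in enumerate(detections):
            if idx in used:
                continue
            overlap = iou(track_box, box)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None:
            matches[track_id] = detections[best_idx][0]
            used.add(best_idx)
    return matches


def associate(tracks, detections, high_thresh=0.5, min_iou=0.3):
    """Match high-confidence detections first, then low-confidence ones."""
    high = [d for d in detections if d[1] >= high_thresh]
    low = [d for d in detections if d[1] < high_thresh]

    first = greedy_match(tracks, high, min_iou)
    # Only tracks left unmatched get a second chance with low-score boxes
    remaining = {tid: box for tid, box in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)
    return {**first, **second}


# Last known boxes of two tracked people, keyed by track ID
tracks = {1: (10, 10, 50, 100), 2: (200, 20, 240, 110)}
# Detections in the next frame as (box, confidence score)
detections = [
    ((12, 11, 52, 101), 0.90),    # clearly visible person
    ((202, 22, 242, 112), 0.15),  # partially hidden person, low score
]
matched = associate(tracks, detections)
print(matched)
```

In this example, a tracker that kept only the high-confidence detection would lose track 2 in this frame, whereas the second matching stage keeps it alive using the low-score box.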

When you combine YOLOv9 and ByteTrack for people pathing, you can review people’s movements across video frames. YOLOv9 provides person detections in each video frame. ByteTrack takes these detections and associates them across frames, creating consistent tracks for each individual, showing how people move through the video over time.

Example code

The following code example is a Python script that can be used as an AWS Lambda function or as part of your processing pipeline. You can also deploy YOLOv9 and ByteTrack for inference using Amazon SageMaker. SageMaker provides several options for model deployment, such as real-time inference, asynchronous inference, serverless inference, and batch inference. You can choose the suitable option based on your business requirements.

Here’s a high-level breakdown of how the Python script is executed:

Load the YOLOv9 model – This model is used for detecting objects in each frame.
Start the ByteTrack tracker – This tracker assigns unique IDs to objects and tracks them across frames.
Iterate through the video frame by frame – For each frame, the script detects objects, tracks their paths, and draws bounding boxes and labels around them. The tracking results for all frames are saved to a JSON file.
Output the processed video – The final video is saved with all the detected and tracked objects, annotated on each frame.

# Install and import the necessary packages.
# The "!" prefix is notebook syntax; in a shell, run the pip commands without it.
!pip install opencv-python ultralytics
!pip install imageio[ffmpeg]

import cv2
import imageio
import json
from ultralytics import YOLO
from pathlib import Path

# Load an official pretrained YOLOv9 segmentation model
model = YOLO('yolov9e-seg.pt')

# Convert YOLOv9 tracking output to a format similar to the people pathing API output
def change_format(results, ts, person_only):
    # Set person_only to True to keep only persons and drop other objects.
    object_json = []

    for obj in results.boxes:
        # Skip detections that the tracker has not assigned an ID to
        if obj.id is None:
            continue
        x_center, y_center, width, height = obj.xywhn[0]
        # Calculate Left and Top from the normalized box center
        left = x_center - (width / 2)
        top = y_center - (height / 2)
        obj_name = results.names[int(obj.cls)]
        # Create a dictionary for each detected object that passes the filter
        if (person_only and obj_name == "person") or not person_only:
            obj_data = {
                obj_name: {
                    "BoundingBox": {
                        "Height": float(height),
                        "Left": float(left),
                        "Top": float(top),
                        "Width": float(width)
                    },
                    "Index": int(obj.id)  # Track ID assigned by the tracker
                },
                "Timestamp": ts  # Frame index at which the object was detected
            }
            # Append inside the if block so filtered-out objects are not added
            object_json.append(obj_data)

    return object_json

# Function for person tracking with JSON output and an optional annotated video
def person_tracking(video_path, person_only=True, save_video=True):
    # open the video file
    reader = imageio.get_reader(video_path)
    frames = []
    i = 0
    all_object_data = []
    file_name = Path(video_path).stem

    for frame in reader:
        # Convert frame from RGB (imageio’s default) to BGR (OpenCV’s default)
        frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        try:
            # Run YOLOv9 tracking on the frame, persisting tracks between frames with ByteTrack
            conf = 0.2
            iou = 0.5
            results = model.track(frame_bgr, persist=True, conf=conf, iou=iou, show=False, tracker="bytetrack.yaml")

            # change detection results to Person pathing API output formats.
            object_json = change_format(results[0], i, person_only)
            all_object_data.append(object_json)

            # Append the annotated frame to the frames list (for mp4 creation)
            annotated_frame = results[0].plot()
            frames.append(annotated_frame)
            i += 1

        except Exception as e:
            print(f"Error processing frame: {e}")
            break

    # Save the object tracking array to a JSON file
    with open(f'{file_name}_output.json', 'w') as file:
        json.dump(all_object_data, file, indent=4)

    # Save the annotated video
    if save_video is True:
        # Create a VideoWriter object for mp4 output
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        output_path = f"{file_name}_annotated.mp4"
        fps = reader.get_meta_data()['fps']
        frame_size = reader.get_meta_data()['size']
        video_writer = cv2.VideoWriter(output_path, fourcc, fps, frame_size)

        # Write each frame to the video and release the video writer object when done
        for frame in frames:
            video_writer.write(frame)
        video_writer.release()
        print(f"Video saved to {output_path}")

    return all_object_data
    
        
# Call the main function
video_path = './MOT17-09-FRCNN-raw.webm'
all_object_data = person_tracking(video_path, person_only=True, save_video=True)
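If downstream code expects something closer to the response of the Rekognition GetPersonTracking operation, the script output can be reshaped. The following is a minimal sketch under two assumptions: the target layout (a Persons list whose items hold a Timestamp in milliseconds and a Person with Index and BoundingBox) follows the Rekognition API reference and should be verified there, and the video has a constant frame rate so that frame indexes can be converted to milliseconds. The function name to_rekognition_style and the sample data are hypothetical.

```python
def to_rekognition_style(all_object_data, fps):
    """Flatten per-frame script output into a Rekognition-like structure.

    Assumption: the "Timestamp" values written by the script are frame
    indexes and the video has a constant frame rate of `fps`.
    """
    persons = []
    for frame_objects in all_object_data:
        for record in frame_objects:
            detail = record.get("person")
            if detail is None:
                continue  # Not a person detection
            persons.append({
                # Convert the frame index to milliseconds from video start
                "Timestamp": int(round(record["Timestamp"] * 1000 / fps)),
                "Person": {
                    "Index": detail["Index"],
                    "BoundingBox": dict(detail["BoundingBox"]),
                },
            })
    return {"Persons": persons}


# Hypothetical sample in the shape that person_tracking() returns
sample = [
    [{"person": {"BoundingBox": {"Height": 0.5, "Left": 0.1,
                                 "Top": 0.2, "Width": 0.3},
                 "Index": 42},
      "Timestamp": 0}],
    [{"person": {"BoundingBox": {"Height": 0.5, "Left": 0.12,
                                 "Top": 0.2, "Width": 0.3},
                 "Index": 42},
      "Timestamp": 1},
     {"car": {"BoundingBox": {"Height": 0.2, "Left": 0.6,
                              "Top": 0.7, "Width": 0.2},
              "Index": 7},
      "Timestamp": 1}],
]
converted = to_rekognition_style(sample, fps=25)
print(converted["Persons"])
```

Records for other object classes are dropped here because people pathing reports persons only.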

Validation

We use the following video to showcase this integration. The video shows a football practice session, where the quarterback is starting a play.

The following table shows an example of the content from the JSON file with person tracking outputs by timestamp.

Timestamp  PersonIndex  Bounding box
                        Height    Left      Top       Width
0          42           0.51017   0.67687   0.44032   0.17873
0          63           0.41175   0.05670   0.3148    0.07048
1          42           0.49158   0.69260   0.44224   0.16388
1          65           0.35100   0.06183   0.57447   0.06801
4          42           0.49799   0.70451   0.428963  0.13996
4          63           0.33107   0.05155   0.59550   0.09304
4          65           0.78138   0.49435   0.20948   0.24886
7          42           0.42591   0.65892   0.44306   0.0951
7          63           0.28395   0.06604   0.58020   0.13908
7          65           0.68804   0.43296   0.30451   0.18394
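Rows like the ones in the preceding table can be grouped by person index to obtain one path per person. The following is a minimal sketch that uses the table values as input and takes the center of each bounding box as the person's position; the function name build_paths and the tuple layout are hypothetical choices for illustration.

```python
from collections import defaultdict

# (Timestamp, PersonIndex, Height, Left, Top, Width), copied from the table
rows = [
    (0, 42, 0.51017, 0.67687, 0.44032, 0.17873),
    (0, 63, 0.41175, 0.05670, 0.3148, 0.07048),
    (1, 42, 0.49158, 0.69260, 0.44224, 0.16388),
    (1, 65, 0.35100, 0.06183, 0.57447, 0.06801),
    (4, 42, 0.49799, 0.70451, 0.428963, 0.13996),
    (4, 63, 0.33107, 0.05155, 0.59550, 0.09304),
    (4, 65, 0.78138, 0.49435, 0.20948, 0.24886),
    (7, 42, 0.42591, 0.65892, 0.44306, 0.0951),
    (7, 63, 0.28395, 0.06604, 0.58020, 0.13908),
    (7, 65, 0.68804, 0.43296, 0.30451, 0.18394),
]


def build_paths(rows):
    """Group rows by person index into time-ordered lists of box centers."""
    paths = defaultdict(list)
    for ts, index, height, left, top, width in rows:
        # Box center in normalized image coordinates
        center_x = left + width / 2
        center_y = top + height / 2
        paths[index].append((ts, center_x, center_y))
    for points in paths.values():
        points.sort(key=lambda point: point[0])
    return dict(paths)


paths = build_paths(rows)
for index, points in sorted(paths.items()):
    # Print each person index with the timestamps at which it was tracked
    print(index, [ts for ts, _, _ in points])
```

Because the coordinates are normalized to the frame size, multiply them by the frame width and height to get pixel positions.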

The following video shows the results with the people tracking output.

Other open source solutions for people pathing

Although YOLOv9 and ByteTrack offer a powerful combination for people pathing, several other open source alternatives are worth considering:

DeepSORT – A popular algorithm that combines deep learning features with traditional tracking methods
FairMOT – Integrates object detection and reidentification in a single network, offering users the ability to track objects in crowded scenes

These solutions can be effectively deployed using Amazon SageMaker for inference.

Conclusion

In this post, we have outlined how you can test and implement YOLOv9 and ByteTrack as an alternative to Rekognition people pathing. Combined with AWS services such as AWS Lambda and Amazon SageMaker, you can implement such open source tools for your applications.

About the Authors

Fangzhou Cheng is a Senior Applied Scientist at AWS. He builds science solutions for Amazon Rekognition and Amazon Monitron to provide customers with state-of-the-art models. His areas of focus include generative AI, computer vision, and time-series data analysis.

Marcel Pividal is a Senior AI Services SA in the Worldwide Specialist Organization, bringing over 22 years of expertise in transforming complex business challenges into innovative technological solutions. As a thought leader in generative AI implementation, he specializes in developing secure, compliant AI architectures for enterprise-scale deployments across multiple industries.
