In this section, we describe how the selected search queries are sampled and annotated to build a taxonomy for portrait-mode video analysis. Similar queries are merged into a single final leaf-node category, and queries that overlap with existing categories are split or removed, yielding about 500 candidate categories organized in a three-layer hierarchy. Each candidate category has roughly 2 to 50 associated search queries, and the corresponding videos are obtained from their original sources without being stored or redistributed.
To address ethical and legal considerations, the annotation task is organized to give annotators reasonable workloads and fair compensation while adhering to human rights principles.
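As a rough illustration of how such a taxonomy might be represented, the sketch below models a three-layer hierarchy in which each leaf category holds the search queries merged into it. This is a minimal, hypothetical example: the category names, queries, and helper functions are our own illustrations and are not taken from the actual dataset or its tooling.

```python
from dataclasses import dataclass, field


@dataclass
class LeafCategory:
    """A leaf node of the taxonomy, holding the queries merged into it."""
    name: str
    queries: set = field(default_factory=set)

    def merge(self, other: "LeafCategory") -> None:
        # Fold a near-duplicate leaf's queries into this category.
        self.queries |= other.queries


# taxonomy[top_level][sub_level] -> list of leaf categories (three layers total).
# "food" / "cooking" / "baking bread" are hypothetical placeholder names.
taxonomy = {
    "food": {
        "cooking": [
            LeafCategory("baking bread", {"how to bake bread", "sourdough recipe"}),
        ],
    },
}


def add_query(top: str, sub: str, leaf_name: str, query: str) -> None:
    """Attach a search query to a leaf, creating intermediate nodes as needed."""
    leaves = taxonomy.setdefault(top, {}).setdefault(sub, [])
    for leaf in leaves:
        if leaf.name == leaf_name:
            # An existing leaf already covers this topic: merge instead of duplicating.
            leaf.queries.add(query)
            return
    leaves.append(LeafCategory(leaf_name, {query}))


# A new query that matches an existing leaf is simply merged into it,
# rather than spawning a separate, near-duplicate category.
add_query("food", "cooking", "baking bread", "easy bread recipe at home")
print(taxonomy["food"]["cooking"][0].queries)
```

In this sketch, deciding whether two queries are "similar" is left to the human annotators; the code only shows how merged queries would accumulate under a shared leaf node of the three-layer hierarchy.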
To make these steps easier to follow, we use everyday language and simple analogies. Building the taxonomy is a bit like planning a meal: each category is a dish, its search queries are the ingredients, and merging similar queries is like combining flavors so the final recipe stays balanced.
Throughout, we aim to balance simplicity with thoroughness, capturing the essence of the sampling and annotation process without oversimplifying it.