Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction (doi:10.7910/DVN/PRHQMK)

Document Description

Citation

Title:

Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction

Identification Number:

doi:10.7910/DVN/PRHQMK

Distributor:

Harvard Dataverse

Date of Distribution:

2024-04-14

Version:

1

Bibliographic Citation:

Melih Serin, 2024, "Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction", https://doi.org/10.7910/DVN/PRHQMK, Harvard Dataverse, V1

Study Description

Citation

Title:

Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction

Identification Number:

doi:10.7910/DVN/PRHQMK

Authoring Entity:

Melih Serin (Boğaziçi University)

Distributor:

Harvard Dataverse

Access Authority:

Melih Serin

Depositor:

KUUJE

Date of Deposit:

2024-04-14

Holdings Information:

https://doi.org/10.7910/DVN/PRHQMK

Study Scope

Keywords:

Engineering, Open-Vocabulary Object Detection with Vision Transformers (OWL-ViT), Object Detection, Vision Transformers, End-to-End Training, Generalized Intersection over Union (gIoU) Loss

Abstract:

Computer vision has been receiving increasing attention since the release of complex generative AI models by tech industry giants such as OpenAI and Google. Within this field, we focus on one specific subfield: image-guided object detection. A detailed literature survey led us to a successful study, Simple Open-Vocabulary Object Detection with Vision Transformers (OWL-ViT) [1], a multifunctional model that can also perform image-guided object detection as a secondary function. In this study, we conducted experiments using the OWL-ViT architecture as the base model and modified the necessary parts to achieve better one-shot performance. Code and models are available on GitHub.

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Other Study-Related Materials

Label:

ImageGuidedObjectDetection.pdf

Notes:

application/pdf