Large-scale Video Classification with Convolutional Neural Networks

Authors

  • Prathamesh Kshirsagar Computer Science, Bharati Vidyapeeth College Of Engineering, Lavale, Pune.
  • Pooja Nagawade Computer Science, Maharashtra Institute Of Technology, Pune

Keywords:

Acoustic Event Detection, Acoustic Scene Classification, Convolutional Neural Networks, Deep Neural Networks, Video Classification.

Abstract

Convolutional Neural Networks (CNNs) have acquired a strong reputation as an image recognition model
class. As a result of these findings, we give a comprehensive empirical evaluation of CNNs for large-scale video
classification using a fresh dataset of 1 million YouTube videos classified into 487 classes. We study a range of strategies
for extending the time domain connectivity of a CNN in order to use local spatio-temporal information. We discuss the
limitations of current training methods and propose a multiresolution, foveated architecture as a possible technique of
expediting training. When compared to strong feature-based networks, our top spatio-temporal networks outperform
them significantly. when compared to single-frame models (59.3 percent), however this is only a marginal improvement
(55.3 percent to 63.9 percent). 60.9 percent in total). We delve deeper on the generalisation performance. Retrain our best
model's top layers on the UCF101 Action Recognition dataset and observe considerable performance improvements over
the UCF-101 baseline. prototype (63.3 percent up from 43.9 percent ).

Published

2021-12-25

How to Cite

Large-scale Video Classification with Convolutional Neural Networks. (2021). International Journal of Advance Engineering and Research Development (IJAERD), 8(11), 1-5. https://www.ijaerd.org/index.php/IJAERD/article/view/4718

Similar Articles

1-10 of 1517

You may also start an advanced similarity search for this article.