Many people do not realize this, but YouTube is a search engine. In fact, it is the second-largest search engine in the world, second only to Google Search itself. As a result, millions of people end up on YouTube while doing a regular Google search for a specific fact or detail. Of course, those of us who lead busy lives would rather have that information presented in easily digestible bites, already curated for us.
To solve this problem, YouTube introduced auto-generated chapters last year to make it easier to jump to the part of the video you’re most interested in. This year, Google has started applying multimodal technology from DeepMind that uses text, audio, and video simultaneously to auto-generate chapters with more speed and accuracy. This technology allows YouTube to increase the number of videos with auto-generated chapters from the eight million it serves today to 80 million over the next year. That tenfold increase is also a huge win and time-saver for creators, who will no longer need to create the chapters themselves.
Another time-saving feature YouTube viewers enjoy is video transcripts, which are often the best and fastest way to get a sense of what a video is about. For that reason, YouTube has started using speech recognition models to transcribe videos more accurately and has made transcripts available to all Android and iOS users. In addition, auto-translated captions, which were previously available only on the web, will now be available on mobile. This expansion means that viewers can now auto-translate video captions in 16 languages, which in turn helps creators grow their global audience.
These improvements go a long way toward helping both viewers and creators get the content they want and need without much fuss. As a key search engine, YouTube has a responsibility to improve discovery and ease of use for all of its viewers. I cannot wait to see what other new features YouTube will bring next year.
Featured Photo by NordWood Themes on Unsplash