The enshittification of AI has led to groans over VLC's choice to use AI at all. I even saw a post cross my feed from someone looking for a replacement for VLC.
VLC is working on on-device, real-time captioning. This has nothing to do with generating images or video with AI. This has nothing to do with LLMs.
(edit: There are claims that VLC is using a local LLM. It will use whisper.cpp and will not be using OpenAI's models. I don't know which models they will use. I cannot find any reference to VLC using an LLM.)
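To be concrete about what "on-device" means here, this is roughly what a fully local whisper.cpp transcription pass looks like. A minimal sketch, not VLC's actual integration; the model path and the silent one-second buffer are placeholders:

```cpp
// Minimal local transcription with whisper.cpp. Nothing here touches the
// network: the model is loaded from disk and inference runs on the CPU/GPU
// of the local machine.
#include <cstdio>
#include <vector>
#include "whisper.h"

int main() {
    // Load a ggml Whisper model from disk (placeholder path).
    struct whisper_context * ctx = whisper_init_from_file_with_params(
        "models/ggml-base.en.bin", whisper_context_default_params());
    if (!ctx) return 1;

    // whisper.cpp expects 16 kHz mono float PCM; in a media player this
    // would come from the decoded audio stream. One second of silence
    // here as a stand-in.
    std::vector<float> pcm(16000, 0.0f);

    // Greedy decoding is the cheapest strategy; fine for captions.
    whisper_full_params params =
        whisper_full_default_params(WHISPER_SAMPLING_GREEDY);

    if (whisper_full(ctx, params, pcm.data(), (int) pcm.size()) != 0) {
        whisper_free(ctx);
        return 1;
    }

    // Each segment carries text plus start/end timestamps (units of
    // 10 ms), which is exactly what a subtitle renderer needs.
    for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
        printf("[%lld -> %lld] %s\n",
               (long long) whisper_full_get_segment_t0(ctx, i),
               (long long) whisper_full_get_segment_t1(ctx, i),
               whisper_full_get_segment_text(ctx, i));
    }

    whisper_free(ctx);
    return 0;
}
```

Real-time captioning would run something like this repeatedly over small windows of the decoded audio (whisper.cpp ships a `stream` example that works that way). The point is that no network request is involved anywhere.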
While human-generated captions would be preferable for accuracy, they are not always available. That leaves a lot of video media inaccessible to people with hearing impairments.
What VLC is doing will contribute to accessibility in a big way.
AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.
I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do that correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human-generated or not.
So long as they're not manipulating the transcription with generative AI, this is the wrong project to demonize.
@bedast It’s tricky, because you’re certainly right about the amount of video with no captions and the unfair inaccessibility that creates. But transcription AI is exactly the same tech as “generative AI” or “LLMs”: it is statistical modeling. It is not different in any way, including in its errors and its fuel and water demands. It’s like vehicle engines and tires: they do a tremendous amount of good every day, including for accessibility, but they also have terrible side effects that warrant complaints.
@sbszine @Moss Honestly, in my opinion, any AI inference that cannot run on-device or on edge compute is not ready for mass use by the public.
There are multiple AI and AI-adjacent tools I use that have no reliance on cloud compute for inference or decision making. My insulin pump, for example, keeps my blood glucose near target with logic that runs entirely on a device the size of a pager.
@Moss @bedast it's the same tech, except that transcription is several orders of magnitude less energy-intensive, both in training and ESPECIALLY in inference. You can trivially run a very solid STT model on a mid-range smartphone without much issue. Running LLM inference on a phone at a comparable level of performance is absolutely non-viable right now. Training an STT model costs basically zero energy compared to an LLM.
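Back-of-envelope, using OpenAI's published Whisper sizes against a typical "small" 7B-parameter LLM (a first-order sketch only; real energy use also depends on tokens generated, hardware, and batching):

```latex
% Inference cost scales roughly with the parameters read per decoding step.
\frac{N_{\mathrm{LLM}}}{N_{\mathrm{STT}}}
  \approx \frac{7 \times 10^{9}}{74 \times 10^{6} \;(\text{Whisper base})}
  \approx 95
% At 4-bit quantization: ~3.5 GB of LLM weights vs. ~37 MB for Whisper base.
```

And that's against Whisper base; the tiny model is roughly half that size again, which is why it fits comfortably on a phone.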