@QuadratureSurfer

QuadratureSurfer@lemmy.world · 5 days ago

Just reported by Mohamed Aruham on Twitter

The oldest tweets I could find that actually started reporting this are from ~16 days ago.

https://x.com/Piotrdotcom/status/1829126494574067992

They reference a page here that was posted on Aug 29th.

https://niebezpiecznik.pl/post/uwazajcie-na-takie-captcha/

QuadratureSurfer@lemmy.world · 8 days ago

Yeah, until the cops pull you over and take your cash under civil asset forfeiture because it’s “suspicious that you have so much cash on hand”.

https://ij.org/press-release/highway-robbery-in-reno-nevada-cops-use-civil-forfeiture-to-steal-a-veterans-life-savings/

QuadratureSurfer@lemmy.world · 8 days ago

The features you miss out on would be depositing checks through the app and app notifications (usually there are a few that you want enabled, but they are only available through the app).

QuadratureSurfer@lemmy.world · 8 days ago

Good luck when banking apps start doing this.

QuadratureSurfer@lemmy.world · 10 days ago

I just want to be able to set alarms with their calendar app (where it currently only sends notifications).

QuadratureSurfer@lemmy.world · 12 days ago

OK, but the most important part of that research paper is published in the GitHub repository, which explains how to provide audio data and text data to recreate any STT model in the same way that they have done.

See the “Approach” section of the GitHub repository: https://github.com/openai/whisper?tab=readme-ov-file#approach

And the “Training Data” section of their model card on GitHub: https://github.com/openai/whisper/blob/main/model-card.md#training-data

With this, you don’t really need the paper hosted on arXiv; you have enough information on how to train or modify the model.

There are guides on how to fine-tune the model yourself: https://huggingface.co/blog/fine-tune-whisper

From what I understand of the OSAID link, that is exactly what they are asking for. The ability to retrain or fine-tune a model fits this definition very well:

The preferred form of making modifications to a machine-learning system is:

Data information […]

Code […]

Weights […]

All 3 of those have been provided.

QuadratureSurfer@lemmy.world · 13 days ago

I don’t understand. What’s missing from the code, model, and weights provided to make this “open source” by the definition in your first link? It seems to meet all of those requirements.

As for the OSAID, per your quote, the exact training dataset is not required; they just need to provide enough information that someone else could train the model using a “similar dataset”.

QuadratureSurfer@lemmy.world · 13 days ago

I did a quick check on the license for Whisper:

Whisper’s code and model weights are released under the MIT License. See LICENSE for further details.

So that definitely meets the Open Source Definition in your first link.

And it looks like it also meets the definition of open source in your second link.

Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of the paper, as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
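For anyone unfamiliar with the WER metric mentioned in that quote: word error rate is usually defined as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model's output, divided by the number of words in the reference. Below is a minimal sketch in plain Python. The function name `word_error_rate` is my own and hypothetical, not something from the Whisper repository, and it skips the text normalization that real evaluations typically apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

CER works the same way, just over characters instead of words.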

QuadratureSurfer@lemmy.world · 13 days ago

The STT (speech-to-text) model that they created, Whisper, is open source, as are a few others:

https://github.com/openai/whisper

https://github.com/orgs/openai/repositories?type=all

QuadratureSurfer@lemmy.world · 15 days ago

Yep, exactly this. It might deter some small-time bot creators, but it won’t stop larger operations and may even help them seem more legitimate.

If anything, my favorite idea comes from this xkcd:

https://xkcd.com/810/

QuadratureSurfer@lemmy.world · 15 days ago

Easy way to get around that with “virtual” addresses: https://ipostal1.com/virtual-address.php

Just pay $10 for every account that you want to create… At that point you may as well go with the solution of charging everyone $10 to create an account. At least that way the instance owner is getting supported, and it would have the same effect.