• polyploy@lemmy.dbzer0.com · ↑111 ↓1 · 5 days ago

    God damn this is bleak.

    Mitch says the first signs of a deepening reliance on AI came when the company’s CEO was found to be rewriting parts of their app so that it would be easier for AI models to understand and help with. “Then”, Mitch says, “I had a meeting with the CEO where he told me he noticed I wasn’t using the Chat GPT account the company had given me. I wasn’t really aware the company was tracking that”.

    “Anyway, he told me that I would need to start using Chat GPT to speed up my development process. Furthermore, he said I should start using Claude, another AI tool, to just wholesale create new features for the app. He walked me through setting up the accounts and had me write one with Claude while I was on call with him. I’m still not entirely sure why he did that, but I think it may have been him trying to convince himself that it would work.”

    Mitch describes this increasing reliance on AI to be not just “incredibly boring”, but ultimately pointless. “Sure, it was faster, but it had a completely different development rhythm”, they say. “In terms of software quality, I would say the code created by the AI was worse than code written by a human–though not drastically so–and was difficult to work with since most of it hadn’t been written by the people whose job it was to oversee it”.

    “One thing to note is that just the thought of using AI to generate code was so demotivating that I think it would counteract any of the speed gains that the tool would provide, and on top of that would produce worse code that I didn’t understand. And that’s not even mentioning the ethical concerns of a tool built on plagiarism.”

    • Pennomi@lemmy.world · ↑52 ↓1 · 5 days ago

      Code written by AI is really poorly written. A couple smells I’ve noticed:

      • Instead of fixing error cases, it overly relies on try/catch blocks, making critical bugs invisible but still present. Dangerous shit (see the sketch after this list).
      • It doesn’t reuse code that already exists in the project. You have to do a lot of extra work letting it know that your helper functions or CSS utility classes exist.
      • It writes things in a very “brute force” way. If it finds any solution, it charges forward with the implementation even if there is a much simpler way. It never thinks “but is there a better way?”
      • Likewise, it rarely uses the actual documentation for your library. It goes entirely off of gut instincts. Half the time if you paste in a documentation page, it finally shapes up and builds the code right. That should be default behavior.
      • It has a strong tendency to undo manual changes I have made, because it doesn’t know the reasons why I made them.

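      A minimal sketch of that first smell, with all names made up for illustration (TypeScript):

        // What generated code often looks like: every failure becomes "false",
        // so a validation bug and a database outage are indistinguishable.
        function isValidEmail(email: string): boolean {
          return email.includes("@");
        }

        function writeToDatabase(user: { id: string; email: string }): void {
          // stand-in for a real persistence call that can throw
          if (!user.id) throw new Error("database rejected record");
        }

        function saveUserSwallowed(user: { id: string; email: string }): boolean {
          try {
            if (!isValidEmail(user.email)) throw new Error("invalid email");
            writeToDatabase(user);
            return true;
          } catch {
            return false; // the critical bug is invisible but still present
          }
        }

        // Failing fast instead: handle the expected case explicitly and let
        // unexpected errors propagate to the caller.
        function saveUser(user: { id: string; email: string }): boolean {
          if (!isValidEmail(user.email)) return false; // expected, recoverable case
          writeToDatabase(user);                       // outages surface loudly
          return true;
        }
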
      On the other hand, if you’re in a green field project and need to throw up some simple, dirty CSS/HTML for a quick marketing page, sure, let the AI bang it out. Some projects don’t need to be done well, they just need to be done fast.

      And the autocomplete features can be a time saver in some cases regardless.

      • svtdragon@lemmy.world · ↑26 ↓2 · edited · 5 days ago

        I just spent about a month using Claude 3.7 to write a new feature for a big OSS product. The change ended up being about 6k loc with about 14k of tests added to an existing codebase with an existing test framework for reference.

        For context I’m a principal-level dev with ~15 years experience.

        The key to making it work for me was treating it like a junior dev. That includes priming it (“accuracy is key here; we can’t swallow errors, we need to fail fast where anything could compromise it”) as well as making it explain itself, show architecture diagrams, and reason based on the results.

        After every change there’s always a pass of “okay but you’re violating the layered architecture here; let’s refactor that; now tell me what the difference is between these two functions, and shouldn’t we just make the one call the other instead of duplicating? This class is doing too much, we need to decompose this interface.” I also started a new session, set its context with the code it just wrote, and had it tell me about assumptions the code base was making, and what failure modes existed. That turned out to be pretty helpful too.

        In my own personal experience it was actually kinda fun. I’d say it made me about twice as productive.

        I would not have said this a month ago. Up until this project, I only had stupid experiences with AI (Gemini, GPT).

        • Pennomi@lemmy.world · ↑15 · 5 days ago

          Agreed. I use it in my daily workflow but you as the senior developer have to understand what can and cannot be delegated, and how to stop it from doing stupid things.

          For instance when I work in computer vision or other math-heavy code, it’s basically useless.

        • FarceOfWill@infosec.pub · ↑13 · 5 days ago

          Typically, working with a junior on a project is slower than working without them. It’s a little odd that you describe this as being like working with a junior, yet also say it made you faster.

          • Quik@infosec.pub · ↑12 · 5 days ago

            I don’t think it’s odd, because LLMs are just way faster than any junior (or senior) Dev. So it’s more like working with four junior devs but with the benefit of having tasks done sequentially without the additional overhead of having to give tasks to individual juniors and context switching to review their changes.

            (Obviously, there are a whole lot of new pitfalls, but there are real benefits in some circumstances.)

          • svtdragon@lemmy.world · ↑3 · 5 days ago

            The PR isn’t public yet (it’s in my fork) but even once I submit it upstream I don’t think I’m ready to out my real identity on Lemmy just yet.

      • AA5B@lemmy.world · ↑5 · 4 days ago

        It doesn’t reuse code that already exists in the project

        I had a pissing contest with one of the junior guys over this. He didn’t seem to understand why we should use the existing function and had learned so little about the code base that he didn’t know where to find it. He’s gone

        The more interesting flaw in his AI code was that it hallucinated an entirely different mocking tool for the unit tests.

      • BeigeAgenda@lemmy.ca · ↑11 · 5 days ago

        Sounds about right, I had a positive experience when I told my local LLM to refactor a function and add a single argument.

        I would not dare letting it loose on a whole source file, because it changes random things giving you more code to review.

        In my view, current LLMs do an acceptable job with:

        • Adding comments
        • Writing docstrings
        • Writing git commit messages
        • Simple tasks on small pieces of code
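
        For example, the kind of small, well-scoped change I mean (made-up names; the docstring is the sort of thing I’d let it draft):

          /**
           * Formats a price for display.
           *
           * @param amount   the numeric price
           * @param currency ISO currency code; the newly added argument,
           *                 defaulted so existing callers keep working
           */
          function formatPrice(amount: number, currency: string = "EUR"): string {
            return `${amount.toFixed(2)} ${currency}`;
          }

          console.log(formatPrice(9.5));        // "9.50 EUR"
          console.log(formatPrice(9.5, "USD")); // "9.50 USD"
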
      • morrowind@lemmy.ml · ↑10 · 5 days ago

        Yeah, likewise. I think it shows that the primary weakness of LLMs right now is not skill or understanding, but context.

        It can’t use functions or docs that it doesn’t know about. Neither can I. RAG systems are supposed to solve this but in practice they don’t seem to work that great

  • oakey66@lemmy.world · ↑67 · 5 days ago

    My consulting company is literally talking about nothing else. It’s fucking awful.

    • Victor@lemmy.world · ↑24 · 5 days ago

      Mine also brought it up at the last company retreat: that it’s important to look into using AI tools and not get “left behind”. Old geezers who don’t code anymore deciding that this is something we want to work with.

      I’m fine with using AI as some sort of dynamic snippets tool where I myself know what I want the code to look like in the end, and where you don’t have to predefine those snippets. But not to write entire apps for me.

      I don’t even use regular dumb snippets. It’s so easy to write code without them, so why should I dumb myself down?

      • oakey66@lemmy.world · ↑6 ↓1 · 4 days ago

        I’m in IT consulting. I have personally done some really cool shit for my clients. Things they didn’t have the talent to do themselves. Business management consulting and tax audit consulting are a completely different story. I don’t help automate away jobs. I’m not presenting decks on stripping companies and governments for parts. Needless to say, not all consulting is created equal, and my hope is that there comes a time when this bubble bursts and this push for AI dies on the vine.

          • oakey66@lemmy.world · ↑2 ↓1 · 4 days ago

            We help them build solutions that they then maintain and own. I’m in analytics. So we’re doing data engineering, security, and delivery.

            • oakey66@lemmy.world · ↑1 · 4 days ago

              Just to clarify the difference: with managed solutions, the consulting firm typically handles ongoing maintenance. In some cases they take over an existing solution rather than the consulting company building something new.

  • metaldream@sopuli.xyz · ↑58 ↓1 · 5 days ago

    Yeah mine is demanding that we prove that we’re using copilot to generate code.

    I wouldn’t care so much if copilot wasn’t a hot pile of garbage. It’s incredibly slow, constantly fails halfway through generating a file, can only handle updating or creating one file at a time, and the code it comes up with always needs manual correction to actually work.

    I mainly use it to generate unit tests and it frequently makes shit up that clearly won’t work. Like directly invoking non-exported functions that I deliberately choose not to export, because they don’t need to be exported.

    90% of the time it actually takes me longer to use copilot. The only thing it’s good for is autocomplete.

    • drspod@lemmy.ml · ↑30 · 5 days ago

      prove that we’re using copilot to generate code

      How do they expect you to do that? And are they capable of telling the difference if you lie about it?

      • Nalivai@lemmy.world · ↑28 · 5 days ago

        Oh, that’s easy to check: if the code is a barely functional, nonsensical pile of shit, then it was autogenerated.

      • metaldream@sopuli.xyz · ↑17 · 5 days ago

        I asked the same question in our meeting about it. They apparently have metrics tracking as part of their contract for copilot. But they also want to see demos etc. Basically forcing us to do busy work and increase stress so the execs can justify blowing money and political capital on AI.

      • 4am@lemm.ee · ↑10 · 5 days ago

        I’m sure it logs requests somewhere and they can just check

    • Blackmist@feddit.uk · ↑18 · 5 days ago

      The only thing it’s good for is autocomplete.

      Which isn’t surprising because that’s all most “AI” is…

    • utopiah@lemmy.world · ↑7 · 4 days ago

      demanding that we prove that we’re using copilot to generate code

      Because that’s how one gets value, right? It’s not that the tool is so efficient that people WANT to use it even when they shouldn’t (e.g. due to commercial secrets)… no, instead you force them to use it. /s

      • AA5B@lemmy.world · ↑1 ↓3 · 4 days ago

        To be fair, it’s a huge cost to the company and they need to justify that there is value. Forcing you to use it regularly may be petty, but getting you to learn it improves your skill base and getting you to use it justifies the cost.

    • anotherandrew@lemmy.mixdown.ca · ↑12 · 5 days ago

      I mainly use it to generate unit tests and it frequently makes shit up that clearly won’t work. Like directly invoking non-exported functions that I deliberately choose not to export, because they don’t need to be exported.

      If you work where I work, their solution is to just #include "the_file.c" so they have access to all the functions/variables I painstakingly marked static specifically to prevent them from trying to unit test the internals.

      • metaldream@sopuli.xyz · ↑3 · 5 days ago

        Lol, that’s ridiculous. At least here they aren’t requiring us to do that (yet).

        In my case, this is modular js code. Copilot doesn’t know how to test the internals of the module so it just calls them directly, which would cause an exception because the function would be undefined.

        I end up rewriting the test myself whenever it does this because there’s no reason to export those functions, it would just cause problems. Typically all this requires is changing the test inputs so the internal logic is covered by the test. It’s just too dumb to know that, because it doesn’t actually understand the code at all.
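
        For example (made-up names, and a plain assert standing in for the real test framework): the helper stays unexported, and the test inputs are chosen so its logic gets covered through the public function.

          import assert from "node:assert";

          // Internal helper: deliberately not exported.
          function clampPercent(value: number): number {
            return Math.min(100, Math.max(0, value));
          }

          export function discountedPrice(price: number, percentOff: number): number {
            return price * (1 - clampPercent(percentOff) / 100);
          }

          // Cover the helper through the exported API: inputs outside 0-100
          // exercise the clamping branches without exporting anything extra.
          assert.strictEqual(discountedPrice(50, 250), 0);  // clamped to 100% off
          assert.strictEqual(discountedPrice(50, -10), 50); // negative -> 0% off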

    • IllNess@infosec.pub · ↑3 · 5 days ago

      How do you prove that you are using copilot?

      Is there a way to reference your chat history?

      • heavydust@sh.itjust.works · ↑2 · 4 days ago

        I would push broken commits with a message saying the code came from the LLM, then revert the whole thing in another commit to fix the crap.

  • Curious Canid@lemmy.ca · ↑26 ↓3 · 5 days ago

    An LLM does not write code. It cobbles together bits and pieces of existing code. Some developers do that too, but the decent ones look at existing code to learn new principles and then apply them. An LLM can’t do that. If human developers have not already written code that solves your problem, an LLM cannot solve your problem.

    The difference between a weak developer and an LLM is that the LLM can plagiarize from a much larger code base and do it much more quickly.

    A lot of coding really is just rehashing existing solutions. LLMs could be useful for that, but a lot of what you get is going to contain errors. Worse yet, LLMs tend to “learn” how to cheat at their tasks. The code they generate often has a lot of exception handling built in to hide the failures. That makes testing and debugging more difficult and time-consuming. And it gets really dangerous if you also rely on an LLM to generate your tests.

    The software industry has already evolved to favor speed over quality. LLM-generated code may be the next logical step. That does not make it a good one. Buggy software in many areas, such as banking and finance, can destroy lives. Buggy software in medical applications can kill people. It would be good if we could avoid that.

    • demizerone@lemmy.world · ↑11 · 5 days ago

      I am at a company that is forcing devs to use AI tooling. So far, it saves a lot of time on an already well-defined project, including documentation. I have not used it to generate tests or to build a greenfield project. Those are coming, though, as we have been told by management that all future projects should include AI components in some way. The Kool-Aid has been consumed deeply.

      • AA5B@lemmy.world · ↑3 · 4 days ago

        I think of AI more as an enhanced autocomplete. Instead of autocompleting function calls, it can autocomplete entire lines.

        Unit tests are fairly repetitive, so it does a decent job of autocompleting those, needing only minor corrections

        I’m still up in the air over regexes. It does generate something, but I’m not sure it adds value.

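        Typical of what it autocompletes (a made-up example): the pattern looks plausible, but every branch still needs checking by hand before I’d trust it.

          // The kind of regex it will happily suggest for "parse a version
          // string like 1.4.2"; plausible, but it still has to be verified.
          const versionPattern = /^(\d+)\.(\d+)\.(\d+)$/;

          function parseVersion(v: string): { major: number; minor: number; patch: number } | null {
            const m = versionPattern.exec(v);
            if (m === null) return null; // fail fast on malformed input
            return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
          }

          console.log(parseVersion("1.4.2"));         // { major: 1, minor: 4, patch: 2 }
          console.log(parseVersion("not-a-version")); // null
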
        I haven’t had much success with the results of generating larger sections of code

    • pixxelkick@lemmy.world · ↑11 ↓1 · 5 days ago

      Same, but they did set up a self hosted instance for us to use and, tbh, it works pretty good.

      I think it’s a good tool specifically for when you dunno what’s going on: helping with brainstorming or exploring different solutions, getting recommended names of tools, finding out “how do other people solve this”, generating documentation, etc.

      But for very straightforward tasks where you already know what you are doing, it’s not helpful, you already know what code you are going to write anyways.

      Right tool for the right job.

      • shortrounddev@lemmy.world · ↑7 ↓1 · 5 days ago

        I use it as a form of google, basically. I ask it coding questions a lot, some of which are a bit more philosophical. I never allow it to write code for me, though. Sometimes I’ll have it check my work

  • ☂️-@lemmy.ml · ↑31 · edited · 5 days ago

    malicious compliance, folks.

    let the ai ruin their codebase, get paid to fix it again.

  • HyonoKo@lemmy.ml · ↑26 · 5 days ago

    I’m very glad the company where I work has explicitly prohibited the use of AI for coding, for fear of spilling company secrets.

  • FearfulSalad@ttrpg.network · ↑24 · edited · 5 days ago

    Preface: I have a lot of AI skepticism.

    My company is using Cursor and Windsurf, focusing on agent mode (and whatever Windsurf’s equivalent is). It hallucinates real hard with any open ended task, but when you have ALL of:

    • an app with good preexisting test coverage
    • the ability to run relevant tests quickly (who has time to run an 18 hour CI suite locally for a 1 line change?)
    • a well thought out product use case with edge cases

    Then you can tell the agent to write test cases before writing code, and run all relevant tests when making any code changes. What it produces is often fine, but rarely great. If you get clever with setting up rules (that tell it to do all of the above), you can sometimes just drop in a product requirement and have it implement, making only minor recommendations. It’s as if you are pair programming with an idiot savant, emphasis on idiot.

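    Roughly the shape of that flow, as a made-up sketch (plain asserts standing in for a real test runner): the requirement goes in as tests first, then the implementation is changed until they pass, and the relevant tests rerun on every change.

      import assert from "node:assert";

      // Step 1: the product requirement, written as tests before any code exists.
      function testEmptyCartTotalsToZero(): void {
        assert.strictEqual(cartTotal([]), 0);
      }
      function testTotalSumsLineItems(): void {
        assert.strictEqual(cartTotal([{ price: 2, qty: 3 }, { price: 5, qty: 1 }]), 11);
      }

      // Step 2: the implementation the agent writes (and rewrites) until the
      // tests above pass.
      interface LineItem { price: number; qty: number }
      function cartTotal(items: LineItem[]): number {
        return items.reduce((sum, item) => sum + item.price * item.qty, 0);
      }

      // Step 3: rerun the relevant tests after every change.
      testEmptyCartTotalsToZero();
      testTotalSumsLineItems();
      console.log("cart tests pass");
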
    But whose app is well covered with tests? (Admittedly, AI can help speed up the boilerplating necessary to backfill test cases, so long as someone knows how the app is supposed to work). Whose app is well-modularized such that it’s easy to select only downstream affected tests for any given code change? (If you know what the modules should be, AI can help… But it’s pretty bad at figuring that out itself). And who writes well thought out product use cases nowadays?

    If we were still in the olde waterfall era, with requirements written by business analysts, then maybe this could unlock the fabled 100x gains per developer. Or 10x gains. Or 1.1x gains, most likely.

    But nowadays it’s more common for AI to write the use cases, hallucinate edge cases that aren’t real, and when coupled with the above, patchwork together an app that no one fully understands, and that only sometimes works.

    Edit: if all of that sounds like TDD, which on its own gives devs a speed boost when they actually use it consistently, and you wonder if CEOs will claim that the boosts are attributable to AI when their devs finally start to TDD like they have been told to for decades now, well, I wonder the same thing.

    • NuXCOM_90Percent@lemmy.zip · ↑14 ↓1 · edited · 5 days ago

      The thing to understand is that it is not about improving developer efficiency. It is about improving corporate profits.

      Because that engineer using “AI”? If they are doing work that can be reliably generated by an AI, then they aren’t a particularly “valuable” coder and, most likely, have some severe title inflation. The person optimizing the DB queries? They are awesome. The person writing out utility functions or integrating a library? Not so much. And, regardless, you are going to need code review, which invariably boils down to a select few who actually can be trusted to think through the implications of an implementation and check that the test coverage was useful.

      End result? A team of ten becomes a team of four. The workload for the team leader goes up as they have to do more code review themselves but that ain’t Management’s problem. And that team now has saved the company closer to a million a year than not. The question isn’t “Why would we use AI if it is only 0.9x as effective as a human being?” and instead “Why are we paying a human being a six figure salary when an AI is 90% as good and we pay once for the entire company?”

      And if people key in on “Well how do you find the people who can be trusted to review the code or make the tickets?”: Congrats. You have thought about this more than most Managers.

      My company hasn’t mandated the use of AI tools yet but it is “strongly encouraged” and we have a few evangelists who can’t stop talking about how “AI” makes them two or three times as fast and blah blah blah. And… I’ve outright side channeled some of the more early career staff that I like and explained why they need to be very careful about saying that “AI” is better at their jobs than they are.

      And I personally make it very clear that these tools are pretty nice for the boilerplate code I dislike writing (mostly unit tests), but that they just aren’t at the point where they can handle the optimizations and design work that I bring to the table. Because stuff is gonna get REALLY bad REALLY fast as the recession/depression speeds up, and I want to make it clear that I am more useful than a “vibe coder” who studied prompt engineering.

      • taladar@sh.itjust.works · ↑3 ↓2 · 5 days ago

        “Why are we paying a human being a six figure salary when an AI is 90% as good and we pay once for the entire company?”

        And if it actually were 90% as good, that would be a valid question. In reality, however, it is more like 9% as good, with occasional downward spikes towards 0.9%.

        • jwmgregory@lemmy.dbzer0.com · ↑2 ↓9 · 5 days ago

          you work in technology, presumably. so you’re supposedly an engineer of sorts.

          what kind of engineer says obviously wrong statements based on their feelings?

          i’m willing to provide different sources and discussion if you object to that one for some reason but virtually all facets of research agree current artificial intelligence performance is nothing like what you are suggesting. what you’re claiming just isn’t true and you are spreading misinformation. it’s okay to be scared but it’s not okay to lie.

          • taladar@sh.itjust.works · ↑8 ↓1 · 5 days ago

            I am not scared, well, except scared that I will have to listen to AI scam BS for the next decade the same way I had to listen to blockchain/cryptocurrency scam BS for the last decade.

            It is not that I haven’t tried the tools either. They just produce extremely horrible results every single time.

  • SocialMediaRefugee@lemmy.world · ↑18 · 5 days ago

    I can’t even get it to help with configurations most of the time. It gives commands that don’t exist in that OS version, etc.

      • superkret@feddit.org · ↑8 · 5 days ago

        With Slackware, that issue doesn’t exist.
        Configuration of the current version works exactly the same now as it did when the holy GNU revealed the ten commandments.

  • Lovable Sidekick@lemmy.world · ↑10 · edited · 4 days ago

    I’m not in the workforce anymore, but these accounts remind me of other fads that were foisted on me and my coworkers over the span of my software career. A couple I remember by name were Agile and Yourdon Structured Design, but there were a bunch more.

    In the old days somebody in management would attend a seminar or get a sales presentation or something and come back with a new “methodology” we were supposed to use. It typically took the form of a stack of binders full of documentation, and was always going to make our productivity “skyrocket”. We would either follow some rigorous process or just go through the motions, or something in between, and in say 6 months to a year the manager would have either left the company or forgotten all about it.

    It sounds like today’s managers are cut from about the same mold as always, and AI is yet another shiny object being dangled in front of them.