• 5 Posts
  • 1.07K Comments
Joined 6 years ago
Cake day: May 31, 2020


  • Ephera@lemmy.ml to Programmer Humor@programming.dev · why?
    2 days ago

    One time, I was staring at a piece of code for a solid 10 minutes or so, and could not understand why it gave me a compile error.
    So, I asked the senior for help, started explaining what I’d been trying to do, scrolled down to show some other code snippet, scrolled back up, and the compile error was gone. My IDE simply had not re-rendered properly. I have rarely sworn as much as in that moment.


  • This is a somewhat hacky solution, but in the past I set things up so that I would share a URL to my desktop via KDE Connect. On my desktop, I then configured the default browser to be a script that I wrote.
    This script would check whether the URL is a YouTube URL and, if so, open it via mpv (with yt-dlp also installed on the system).
    If not, it would just open the URL in Firefox as normal.
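
A minimal sketch of what such a "default browser" script could look like, written in Python. This is not the original script: the host list, the function name `choose_command`, and the assumption that `mpv` and `firefox` are on the `PATH` are all made up for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical 'default browser' script: YouTube links go to mpv, everything else to Firefox."""
import subprocess
import sys
from urllib.parse import urlparse

# Assumed set of hosts that count as "a YouTube URL"; adjust to taste.
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def choose_command(url: str) -> list[str]:
    """Return the command line that should handle the given URL."""
    host = (urlparse(url).hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        # mpv passes the URL on to yt-dlp, which therefore has to be installed too.
        return ["mpv", url]
    return ["firefox", url]


# Only act when called with exactly one argument, the URL to open.
if __name__ == "__main__" and len(sys.argv) == 2:
    subprocess.run(choose_command(sys.argv[1]), check=False)
```

On a Linux desktop, the script still has to be registered as the default browser, typically via a `.desktop` entry that points at it; how exactly depends on the desktop environment.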


  • That’s really not a good sign, though. A review process to check for basic sanity is just a band-aid fix for a lack of discipline, and it ultimately requires more work to be done. So, the person who asked the magic pattern machine should review that code, as they should be deeper into the context of what needs to be done, and they know which parts of the code were generated and which parts they actually logically thought about.









  • Having to make a decision isn’t my primary issue here (even though it can also be problematic when you need to serialize domain-specific data in a domain where you’re no expert). My issue is rather that you have to write this decision down, so that it can be used for deserializing again. This just makes XML serialization code significantly more complex than JSON serialization code, both in terms of the code becoming harder to understand and in the number of lines of code needed.
    I’ve somewhat come to expect fewer than a handful of lines of code for serializing an object from memory into a file. If you do that with XML, it will just slap everything into child nodes, which may be fine, but might also not be.


  • Ah, well, as far as XML is concerned, yeah, these are very different things, but that’s where the problem stems from. In your programming language, you don’t have two variants. You just have (person (name "Alice") (age 30)). But then, because XML distinguishes between metadata and data, you have to decide whether “name” and “age” are one or the other.

    And the point I wanted to make, which perhaps didn’t come across well, is that you have to write down that decision somewhere, so that when you deserialize in the future, you know whether to read these fields from attributes or from child nodes.
    And that just makes your XML serialization code so much more complex than it is for JSON, generally speaking. As in, I can slap down JSON serialization in 2 lines of code and it generally does what I expect, in Rust in this case.

    Granted, Rust kind of lends itself to being serialized as JSON, but well, I’m just not aware of languages that lend themselves to being serialized as XML. The language with the best XML support that I’m aware of is Scala, where you can actually get XML literals into the language (these days with a library, but it used to be built in until Scala 3, I believe): https://javadoc.io/doc/org.scala-lang.modules/scala-xml_2.13/latest/scala/xml/index.html
    But even in Scala, you don’t use a case class for XML, which is what you normally use for data records in the language, but rather you would take the values out of your case class and stick them into such an XML literal. Or I guess, you would use e.g. the Jackson XML serializer from Java. And yeah, the attribute vs. child node divide is the main reason why this intermediate step is necessary. Meanwhile, JSON has comparatively little logic built into the language/libraries and it’s still a lot easier to write out: https://docs.scala-lang.org/toolkit/json-serialize.html
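
To illustrate the "write the decision down" point, here is a rough sketch in Python rather than Rust or Scala, using only the standard library. The `Person` class, the helper names, and the choice to put `name` into an attribute and `age` into a child node are assumptions made purely for the example.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Person:
    name: str
    age: int


def to_json(person: Person) -> str:
    # No layout decision needed: fields map directly onto object keys.
    return json.dumps(asdict(person))


def from_json(text: str) -> Person:
    return Person(**json.loads(text))


# The extra decision that XML forces, written down so both directions agree.
# Which field goes where is an arbitrary choice for this example.
ATTRIBUTE_FIELDS = {"name"}


def to_xml(person: Person) -> str:
    root = ET.Element("person")
    for key, value in asdict(person).items():
        if key in ATTRIBUTE_FIELDS:
            root.set(key, str(value))
        else:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Person:
    root = ET.fromstring(text)

    def read(key: str) -> str:
        # The reader has to consult the same decision as the writer.
        if key in ATTRIBUTE_FIELDS:
            return root.attrib[key]
        return root.findtext(key)

    # XML only carries text, so the integer is converted back by hand.
    return Person(name=read("name"), age=int(read("age")))
```

The JSON half is essentially one line per direction, while the XML half needs the `ATTRIBUTE_FIELDS` table plus code on both sides that respects it.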


  • Alright, I haven’t really looked into XML specifications so far. But I also have to say that needing a specification to consistently serialize and deserialize data isn’t great either.

    And yes, JSON not having attributes is what I’m saying is a good thing, at least for most data serialization use-cases, since programming languages do not typically have such attributes on their data type fields either.


  • IMHO one of the fundamental problems with XML for data serialization is illustrated in the article:

    (person (name "Alice") (age 30))
    [is serialized as]

    <person>
      <name>Alice</name>
      <age>30</age>
    </person>
    

    Or with attributes:
    <person name="Alice" age="30" />

    The same data can be portrayed in two different ways. Whenever you serialize or deserialize data, you need to decide whether to read/write values from/to child nodes or attributes.

    That’s because XML is a markup language. It’s great for typing up documents, e.g. to describe a user interface. It was not designed for taking programmatic data and serializing that out.
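
As a small illustration of why the reader has to know which form was written, here is a sketch using Python's standard `xml.etree.ElementTree`. The two helper functions are hypothetical; each one only understands one of the two layouts.

```python
import xml.etree.ElementTree as ET

# The same record in the two layouts from the article.
AS_CHILD_NODES = "<person><name>Alice</name><age>30</age></person>"
AS_ATTRIBUTES = '<person name="Alice" age="30" />'


def read_from_children(xml_text: str) -> dict:
    """Collect the fields of a record that was written as child nodes."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def read_from_attributes(xml_text: str) -> dict:
    """Collect the fields of a record that was written as attributes."""
    return dict(ET.fromstring(xml_text).attrib)
```

Each reader finds both fields in its own layout and nothing at all in the other one, so picking the wrong reader silently yields an empty record instead of an error.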





  • There should be an open-source recommendation algorithm, though; I’m sure of it.

    Problem is that the kind of algorithm you envision is technologically a black box, not just by choice. It’s a machine learning model. At best, you could make the training data and instructions public, but it would still be hard to reason about why it makes certain decisions. Corporations traditionally try to eliminate biases by throwing as much data at it as possible, but that makes it even harder to reason about.

    I guess, maybe you could try to split the tasks. So, set up a list of e.g. 50 topics, such as sports, IT, politics etc. Then use a small language model to decide which categories each post fits into. And then you could let the user decide the weights for the topics + weights for recency and vote count.
    Or I guess, automatically decide the weights based on what the user upvotes and then make the weights transparent to each user.

    But yeah, I don’t think there’s prior art in this respect, so it would probably still need lots of experimenting.
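
A rough sketch of how the transparent-weights idea could look in Python. Everything here is an assumption for illustration: the scoring formula, the field names, and the update rule are made up, and the per-topic fit values are simply given as input where a small language model would normally produce them.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    topics: dict[str, float]  # topic -> how well the post fits it, 0.0 to 1.0
    age_hours: float
    votes: int


@dataclass
class Weights:
    topics: dict[str, float]  # visible to and editable by the user
    recency: float = 1.0
    votes: float = 1.0


def score(post: Post, weights: Weights) -> float:
    """Combine topic fit, recency and vote count into one inspectable number."""
    topic_part = sum(
        weights.topics.get(topic, 0.0) * fit for topic, fit in post.topics.items()
    )
    # Recency decays with age; a one-day-old post gets half the recency weight.
    recency_part = weights.recency / (1.0 + post.age_hours / 24.0)
    # Logarithm so that very popular posts do not drown out everything else.
    votes_part = weights.votes * math.log1p(max(post.votes, 0))
    return topic_part + recency_part + votes_part


def rank(posts: list[Post], weights: Weights) -> list[Post]:
    return sorted(posts, key=lambda post: score(post, weights), reverse=True)


def learn_from_upvote(weights: Weights, post: Post, step: float = 0.1) -> None:
    """Nudge the topic weights towards the topics of an upvoted post."""
    for topic, fit in post.topics.items():
        weights.topics[topic] = weights.topics.get(topic, 0.0) + step * fit
```

Because the whole model is a handful of numbers in `Weights`, a user can read them, change them by hand, or reset whatever `learn_from_upvote` has accumulated.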