• Eufalconimorph@discuss.tchncs.de
    link
    fedilink
    arrow-up
    1
    ·
    2 days ago

    Not exactly a data structure on its own, but bitslicing is a neat trick to turn some variable-time operations into constant-time operations. It's used in cryptography for “substitution box” (S-box) operations, which can otherwise leak secrets via data-dependent timing variations.

    The data structure side of it is breaking n words up into their bits and interleaving them across n variables (usually machine registers), so that the first variable contains the first bit of each word, the second variable the second bit, and so on. It’s also called “SIMD within a register”.
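
    For illustration, a minimal sketch of that interleaving (the names are mine, not from any particular library): slice[i] collects bit i of every input byte, after which an S-box becomes a fixed sequence of AND/XOR/NOT on the slices, with no data-dependent branches or table lookups.

        #include <cstdint>

        // Bit-slice 8 bytes: slice[i] holds bit i of every input word, so a
        // single logic op on slice[i] processes that bit position of all 8
        // words at once, taking the same time regardless of the data.
        void bitslice8(const uint8_t in[8], uint8_t slice[8]) {
            for (int bit = 0; bit < 8; ++bit) {
                uint8_t s = 0;
                for (int word = 0; word < 8; ++word)
                    s |= static_cast<uint8_t>(((in[word] >> bit) & 1u) << word);
                slice[bit] = s;
            }
        }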

  • duckythescientist@sh.itjust.works
    link
    fedilink
    arrow-up
    76
    ·
    5 days ago

    I’m also not sure if this is obscure, but Bloom filters! A Bloom filter is a structure you can add elements to and then ask whether it has seen an element before, with the answer being either “no” or “probably yes”. There’s a trade-off between the confidence of a “probably yes”, how many elements you expect to add, and how big the Bloom filter is, but it’s very space- and time-efficient. And it uses hash functions, which always make for a fun time.
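
    A minimal sketch of the idea (the sizes, the double-hashing scheme, and all names here are illustrative, not a production design):

        #include <bitset>
        #include <cstddef>
        #include <functional>
        #include <string>

        // K bit positions are derived from two hashes of the element; adding
        // sets all K bits, querying checks them all.
        struct Bloom {
            static constexpr std::size_t M = 1 << 16; // number of bits
            static constexpr int K = 4;               // bits set per element
            std::bitset<M> bits;

            static void positions(const std::string& s, std::size_t pos[K]) {
                std::size_t h1 = std::hash<std::string>{}(s);
                std::size_t h2 = h1 * 0x9e3779b97f4a7c15ULL + 1; // crude second hash
                for (int i = 0; i < K; ++i) pos[i] = (h1 + i * h2) % M;
            }
            void add(const std::string& s) {
                std::size_t pos[K];
                positions(s, pos);
                for (std::size_t p : pos) bits.set(p);
            }
            bool maybe_contains(const std::string& s) const {
                std::size_t pos[K];
                positions(s, pos);
                for (std::size_t p : pos)
                    if (!bits.test(p)) return false; // definitely never added
                return true; // probably added (false positives possible)
            }
        };

    Tuning M and K against the expected number of elements is exactly the trade-off mentioned above.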

    • lad@programming.dev
      link
      fedilink
      English
      arrow-up
      25
      ·
      5 days ago

      Relevant xkcd

      in Randall's words

      Sometimes, you can tell Bloom filters are the wrong tool for the job, but when they’re the right one you can never be sure.

    • FizzyOrange@programming.dev
      link
      fedilink
      arrow-up
      9
      ·
      5 days ago

      Obscure 10 years ago, maybe. These days there have been so many articles about them that I bet they’re more widely known than more useful, standard things like prefix trees (aka tries).

  • Trigger2_2000@sh.itjust.works
    link
    fedilink
    arrow-up
    5
    ·
    3 days ago

    Not really a data structure per se, but just knowing LISP and the interesting structures it uses internally.

    The results of LISP operations CAR, CDR, CADR and the other one I can’t remember now.
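
    For anyone who hasn't met them, a rough sketch of the cons cell those operations walk over, translated to C++ with an int payload for simplicity (names are mine):

        #include <cstdio>

        // A Lisp list is a chain of two-field cons cells: CAR reads the first
        // field, CDR the second, and CADR is just CAR(CDR(x)).
        struct Cons {
            int   car; // first element
            Cons* cdr; // rest of the list, or nullptr for NIL
        };

        int main() {
            Cons c{3, nullptr}, b{2, &c}, a{1, &b}; // the list (1 2 3)
            std::printf("%d\n", a.cdr->car);        // CADR: prints 2
        }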

    • Pissmidget@lemmy.world
      link
      fedilink
      arrow-up
      2
      ·
      5 days ago

      From just the name, my mind instantly went to “conflict diamonds”, and I began to wonder what constitutes a conflict-free boolean or integer.

      If anyone wants to take a crack at writing up why primitives are unfortunate and we should move on to new “conflict-free data types”™, I will cheer you on!

      Also, a very interesting read about actual conflict-free replicated data types. Cheers!

    • BackgrndNoize@lemmy.world
      link
      fedilink
      arrow-up
      1
      ·
      5 days ago

      This sounds like document collaboration software like Google Sheets, where multiple people can edit a document at the same time.

      • subversive_dev@lemmy.ml
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1
        ·
        5 days ago

        Yes, exactly; collaborative editing is probably the number-one use case. I doubt Google Docs actually uses a real CRDT behind the scenes, but they do have some big brains over there in the chocolate factory.

  • JackbyDev@programming.dev
    link
    fedilink
    English
    arrow-up
    4
    ·
    3 days ago

    B-trees are cool but not necessarily obscure. I didn’t learn about them in college. The name sounds like “binary tree”, and the two are related, but different: a B-tree packs many keys into each node so that one disk read fetches a whole node’s worth of keys, taking advantage of the way disk reads work.
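
    A hedged sketch of that point (the layout and constants are illustrative): one node is sized to fill a 4 KiB disk block, so every read pulls in hundreds of keys and the tree stays very shallow.

        #include <cstdint>

        constexpr int ORDER = 255; // picked so one node roughly fills 4 KiB

        struct BTreeNode {
            uint16_t nkeys;           // number of keys currently stored
            bool     leaf;            // true if the node has no children
            int64_t  keys[ORDER - 1]; // sorted keys
            uint64_t children[ORDER]; // on-disk block numbers of children
        };
        static_assert(sizeof(BTreeNode) <= 4096, "node fits one disk block");

    With around 255 children per node, three levels already index millions of keys, so a lookup touches only a handful of disk blocks.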

  • Vorpal@programming.dev
    link
    fedilink
    arrow-up
    23
    ·
    edit-2
    4 days ago

    XOR lists are obscure and cursed but cool. And not useful on modern hardware as the CPU can’t predict access patterns. They date from a time when every byte of memory counted and CPUs didn’t have pipelines.

    (In general, all linked lists or trees are terrible for performance on modern CPUs. Prefer vectors or btrees with large fanout factors. There are some niche use cases still for linked lists in for example kernels, but unless you know exactly what you are doing you shouldn’t use linked data structures.)
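
    For the curious, a minimal sketch of the trick (illustrative only; it leans on pointer-to-integer casts that behave sensibly on mainstream platforms): each node stores prev XOR next in one field, so the list can be walked in either direction with half the link storage.

        #include <cstdint>
        #include <cstdio>

        struct Node {
            int value;
            uintptr_t link; // XOR of the previous and next nodes' addresses
        };

        uintptr_t ptr(Node* n) { return reinterpret_cast<uintptr_t>(n); }

        // Walk forward: the next node is recovered as link XOR prev.
        void traverse(Node* head) {
            Node* prev = nullptr;
            for (Node* cur = head; cur != nullptr;) {
                std::printf("%d\n", cur->value);
                Node* next = reinterpret_cast<Node*>(cur->link ^ ptr(prev));
                prev = cur;
                cur = next;
            }
        }

        int main() {
            Node a{1, 0}, b{2, 0}, c{3, 0};
            a.link = ptr(&b);            // prev is null, so link is just next
            b.link = ptr(&a) ^ ptr(&c);
            c.link = ptr(&b);            // next is null, so link is just prev
            traverse(&a);                // prints 1 2 3
        }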

    EDIT: Fixed spelling

  • dustletter@piefed.blahaj.zone
    link
    fedilink
    English
    arrow-up
    8
    ·
    4 days ago

    Skew binary trees. They’re an immutable data structure combining the performance characteristics of lists (O(1) non-amortized push/pop) and B-trees (O(log n) lookup and updates). They use a sequence of complete trees, cleverly arranged using skew binary numbers so that adding an element never causes cascading updates. In practice they’ve been superseded by relaxed radix balanced trees.

  • Marc@sueden.social
    link
    fedilink
    arrow-up
    26
    ·
    5 days ago

    @protein

    Finger Tree!

    A persistent, purely functional workhorse. Amortized O(1) access at both ends, O(log n) concatenation/splitting.

    It generalizes elegantly to build sequences, priority queues, and more, and powers Haskell’s Data.Sequence. A functional programmer’s secret weapon.

    • Vorpal@programming.dev
      link
      fedilink
      arrow-up
      7
      ·
      5 days ago

      On paper they are efficient. In practice, all pointer-based data structures (linked lists, binary trees, etc.) are slow on modern hardware, and that effect matters more than asymptotic complexity for most practical high-performance code.

      You are far better off with linear access where possible (e.g. vectors, open-addressing hash maps) or, if you must have a tree, making the fan-out factor as large as possible (e.g. B-trees rather than binary trees).

      Now, I don’t know if Haskell etc. affords you such control; I mainly code in Rust (and C++ in the past).

      Also see this old thread from 2016 on hacker news about this very topic: https://news.ycombinator.com/item?id=13263275

      • Marc@sueden.social
        link
        fedilink
        arrow-up
        7
        arrow-down
        1
        ·
        5 days ago

        @Vorpal

        Totally fair point, thanks for calling that out.

        When I mentioned finger trees I was thinking more about the *functional* side (persistence, elegant composition, Haskell/Data.Sequence style usage) than raw performance on real hardware.

        In performance‑critical code your argument for cache‑friendly linear structures and wide trees absolutely makes sense, and I appreciate the reminder to think about actual access patterns and hardware effects, not just asymptotic complexity.

        • Vorpal@programming.dev
          link
          fedilink
          arrow-up
          3
          ·
          4 days ago

          I think a lot of modern software is bloated. I remember when GUI programs used to fit on a floppy or two. Nowadays we have bloated Electron programs taking hundreds of MB of RAM just to show a simple text editor, because they drag a whole browser along with them.

          I love snappy software, and while I don’t think we need to go back to programs fitting on a single floppy and using hundreds of KB of RAM, the pendulum does need to swing back a fair bit. I rewrote some CLI programs in the last few years that I found slow (one of my own, previously written in Python; the other written in C++ but not properly designed for speed). I used Rust, which certainly helped compared to Python, but the real key was thinking carefully about the data structures up front and designing for performance, with lots of profiling and benchmarking along the way.

          The results? The Python program was sped up by 50x, the C++ program by 320x. In both cases it went from “irritating delay” to “functionally instant for human perception”.

          The two programs:

          And I also rewrote a program I used to manage Arch Linux configs (written in bash) in Rust. I also added features I wanted so it was never directly comparable (and I don’t have numbers), but it made “apply configs to system” take seconds instead of minutes, with several additional features as well. (https://github.com/VorpalBlade/paketkoll/tree/main/crates/konfigkoll)

          Oh and want a faster way to check file integrity vs the package manager on your Linux distro? Did that too.

          Now what was the point I was making again? Maybe I’m just sensitive to slow software. I disable all animations in GUIs, after all; all those milliseconds of waiting add up over the years. Computers are amazingly fast these days, and we shouldn’t make them slower than they have to be. So I think far more software should count as performance critical: anything a human has to wait for should be.

          Faster software is more efficient as well, using less electricity and making your phone/laptop battery last longer (since the CPU can go back to sleep sooner). It also saves you money in the cloud: imagine saving 30-50% on your cloud bill by renting fewer resources. Over the last few years I have seen multiple reports of this happening when companies rewrite in Rust (C++ would also do this, but why would you want to move to C++ these days?). And hyperscalers save millions in electricity by optimising their logging library by just a few percent.

          Most modern software on modern CPUs is bottlenecked on memory bandwidth, so it makes sense to spend effort on data representation. Sure, start with some basic profiling to find the obvious stupid things (all non-trivial software that hasn’t been optimised has stupid things), but once you’ve exhausted those, you need to look at memory layout.

          (My dayjob involves hard realtime embedded software. No, I swear that is unrelated to this.)

      • coherent_domain@infosec.pub
        link
        fedilink
        English
        arrow-up
        4
        ·
        edit-2
        5 days ago

        I don’t know if Haskell etc affords you such control

        You can have immutable arrays with vectors, but to mutate them you need to wrap the action in a monad. It even supports unboxed values.

        https://hackage.haskell.org/package/vector

        But I agree that boxing values by default causes a lot of performance overhead in many high-level languages.

  • QueenMidna@lemmy.ca
    link
    fedilink
    English
    arrow-up
    4
    ·
    3 days ago

    I’ve been knee-deep in these lately so I’m a big fan

    Theta sketches!

    Do you want to approximately count a large volume of items, but save the state for later so you can UNION, INTERSECT, and even DIFF the results? Then Thetas are right for you!

    Or basically anything in the Apache DataSketches library.

  • Gobbel2000@programming.dev
    link
    fedilink
    arrow-up
    14
    ·
    5 days ago

    The CSR (compressed sparse row) format is a very simple but efficient way of storing sparse matrices, meaning matrices with a large number of zero entries which should not all occupy memory. It uses three arrays: the first holds all non-zero entries in order, read row by row; the second holds the column index of each non-zero entry (and therefore has the same length as the first); the third holds, for each row, the index into the first array where that row starts, so we can tell where a new row begins.

    On sparse matrices it has optimal memory efficiency and fast lookups; the main downside is that adding or removing elements requires shifting all three arrays, so it is mostly useful for immutable data.
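
    A small worked sketch of the three arrays, plus the matrix-vector product they make cheap (the matrix and names are made up for illustration):

        #include <cstdio>
        #include <vector>

        int main() {
            // 3x4 matrix:   [5 0 0 2]
            //               [0 3 0 0]
            //               [0 0 4 1]
            std::vector<double> values  = {5, 2, 3, 4, 1}; // non-zeros, row by row
            std::vector<int>    col_idx = {0, 3, 1, 2, 3}; // column of each value
            std::vector<int>    row_ptr = {0, 2, 3, 5};    // where each row starts

            // y = A * x touches only the non-zero entries.
            std::vector<double> x = {1, 1, 1, 1}, y(3, 0.0);
            for (int row = 0; row < 3; ++row)
                for (int i = row_ptr[row]; i < row_ptr[row + 1]; ++i)
                    y[row] += values[i] * x[col_idx[i]];

            for (double v : y) std::printf("%g\n", v); // 7 3 5
        }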

    • jxk@sh.itjust.works
      link
      fedilink
      arrow-up
      1
      ·
      2 days ago

      Oh yeah that’s a good one

      And also, if you’re representing a 0/1 matrix, you can do away with the values array altogether.

      • Gobbel2000@programming.dev
        link
        fedilink
        arrow-up
        1
        ·
        2 days ago

        Right, which occurs in particular when dealing with graphs, whose adjacency matrices are usually sparse. Large graphs are what I used this format for; however, I also needed edge weights, so the values array was still required for that.

  • Atlas_@lemmy.world
    link
    fedilink
    arrow-up
    6
    ·
    4 days ago

    Fibonacci heaps are pretty cool. Not used very often b/c they’re awful to implement, but they have better amortized complexity than most other heaps (O(1) insert and decrease-key).

    Also Binary Lifting is closer to an algorithm than a data structure but it’s used in Competitive Programming a fair bit, and isn’t often taught: https://cp-algorithms.com/graph/lca_binary_lifting.html

    And again closer to an algo than a data structure, but Sum over Subsets DP in O(3^n) also has a cool little bit of math in it: https://cp-algorithms.com/algebra/all-submasks.html
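
    The core trick from that second link, for reference: stepping s = (s - 1) & m visits every non-empty submask of m exactly once, and summed over all masks of n bits the total work is O(3^n). A tiny demo:

        #include <cstdio>

        int main() {
            int m = 0b1011; // example mask
            for (int s = m; s > 0; s = (s - 1) & m)
                std::printf("%d ", s); // prints: 11 10 9 8 3 2 1
        }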

  • xthexder@l.sw0.com
    link
    fedilink
    arrow-up
    12
    ·
    edit-2
    2 days ago

    I came up with a kind of clever data type for storing short strings in a fixed-size struct, so they can be stored on the stack or inline without any allocations. It’s always null-terminated, so it can be passed directly as a C-style string, but it also stores the string length without using any additional space (getting the length of a C string normally requires iterating to find the end). The trick is to store the number of unused bytes in the last byte of the buffer: when the string is full there are 0 unused bytes, so the size byte overlaps the null terminator. (Only works for strings < 256 chars excluding the null byte.)

    Implementation in C++ here: https://github.com/frustra/strayphotons/blob/master/src/common/common/InlineString.hh
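
    For readers who want just the gist, here's a minimal sketch of the same trick (my own simplification, not the linked implementation; bounds checking omitted):

        #include <cstddef>
        #include <cstdint>
        #include <cstring>

        // buf[N-1] holds the count of unused bytes; it is 0 exactly when the
        // string is full, so the size byte then doubles as the terminator.
        template <std::size_t N = 256>
        struct TinyString {
            char buf[N];

            TinyString(const char* s) {
                std::size_t len = std::strlen(s); // caller ensures len < N
                std::memcpy(buf, s, len);
                buf[len] = '\0';
                buf[N - 1] = static_cast<char>(N - 1 - len); // unused bytes
            }
            std::size_t size() const { // O(1), no strlen needed
                return N - 1 - static_cast<std::uint8_t>(buf[N - 1]);
            }
            const char* c_str() const { return buf; } // valid C string
        };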

    Edit: Since a couple people don’t seem to understand the performance impact of this vs regular std::string, here’s a demo: https://godbolt.org/z/34j7obnbs. It generates 10000 strings like “Hello, World! 00001” via concatenation. The effect is huge in debug mode, but there are still performance benefits with optimizations turned on:

    With -O3 optimization:
    std::string: 0.949216ms
    char[256] with strlen: 0.88104ms
    char[256] without strlen: 0.684734ms

    With no optimization:
    std::string: 3.5501ms
    char[256] with strlen: 0.885888ms
    char[256] without strlen: 0.687733ms

    (You may need to run it a few times to get representative numbers due to random server load on godbolt.)

    Changing the buffer size to 32 bytes makes a negligible difference compared to 256 bytes in this case, though it might be slightly faster due to the whole string fitting in a cache line.
    
    • anton@lemmy.blahaj.zone
      link
      fedilink
      arrow-up
      1
      ·
      3 days ago

      I came up with a kind of clever data type for storing short strings in a fixed size struct so they can be stored on the stack or inline without any allocations.

      C++ already does that for short strings while seamlessly switching to allocation for long strings.

      It’s always null-terminated so it can be passed directly as a C-style string, but it also stores the string length without using any additional data (Getting the length would normally have to iterate to find the end).

      Also the case in the standard library

      The trick is to store the number of unused bytes in the last character of the buffer. When the string is full, there are 0 unused bytes and the size byte overlaps the null terminator.

      IIRC, that trick was used in one implementation but discontinued because it was against the standard.

      (Only works for strings < 256 chars excluding null byte)

      If you use a niche to mark the heap-allocated case you can get to 254, but the typical choice seems to be around 16.

      • xthexder@l.sw0.com
        link
        fedilink
        arrow-up
        1
        ·
        edit-2
        3 days ago

        C++ already does that for short strings

        I’ve already been discussing this. Maybe read the rest of the thread.

        Also the case in the standard library

        I think you’re missing the point of why. I built this to be a nearly drop-in replacement for the standard string. If it weren’t, it would need even more processing and work to pass the strings to anything.

        discontinued because it was against the standard.

        Standards don’t matter for an internal type that’s not exposed to public APIs. I’m not trying to be exactly compatible with everything under the sun. There’s no undefined behavior here, so it’s fine.

      • xthexder@l.sw0.com
        link
        fedilink
        arrow-up
        5
        ·
        4 days ago

        22 characters is significantly less useful than 255 characters. I use this for resource name keys, asset file paths, and a few other scenarios. The max size is configurable, so I know that nothing I am going to store is ever going to require heap allocations (really bad to be doing every frame in a game engine).

        I developed this specifically after benchmarking a simpler version and noticing a significant amount of time being spent in strlen(), and it had real benefits in my case.
        Admittedly, just storing a struct with a static buffer and a separate size field would have worked pretty much the same and eliminated the 255-char limitation, but it was fun to build.

        • FizzyOrange@programming.dev
          link
          fedilink
          arrow-up
          1
          ·
          3 days ago

          22 characters is significantly less useful than 255 characters.

          You can still use more than 22 characters; it just switches to the heap.

          nothing I am going to store is ever going to require heap allocations

          I would put good money on using 256 bytes everywhere being slower overall than just using the heap when you need more than 22 characters. 22 is quite a lot, especially for keys. ThisReallyLongKey is still only 17.

          • xthexder@l.sw0.com
            link
            fedilink
            arrow-up
            1
            ·
            3 days ago

            I don’t use 256 bytes everywhere. I use a mix of 64, 128, and 256 byte strings depending on the specific use case.
            In a hot path, having the data inline is much more important than saving a few hundred bytes. Cache efficiency plus eliminating heap allocations has huge performance benefits in a game engine that’s running frames as fast as possible.

            • FizzyOrange@programming.dev
              link
              fedilink
              arrow-up
              1
              ·
              2 days ago

              having the data inline

              It’s not as simple as that, depending on the architecture. Typically they would fetch 64-byte cache lines so your 128 bytes aren’t going to be magically more cached than 128 bytes on the heap.

              Avoiding allocations may help but also maybe not. This is definitely in “I don’t believe it until I see benchmarks” realm. I would be really really surprised if the allocation cost was remotely bad enough to justify the “sorry your file is too long” errors.

              • xthexder@l.sw0.com
                link
                fedilink
                arrow-up
                1
                ·
                2 days ago

                Check out the benchmark I edited into my original post. These are not user-provided strings in my case.

  • myfavouritename@beehaw.org
    link
    fedilink
    English
    arrow-up
    9
    ·
    4 days ago

    I get way more use out of Doubly Connected Edge Lists (DCEL) than I ever thought I would when I first learned about them in school.

    When I want to render simple stuff to the screen, built-in functions like ‘circle’ or ‘line’ work. But for any shapes more complicated than that, I often find that it’s useful to work with the data in DCEL form.
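
    For anyone unfamiliar, a hedged sketch of the core DCEL (half-edge) representation, with names of my choosing: each undirected edge is split into two directed half-edges, and walks over the mesh hop between twin and next pointers.

        struct Vertex;
        struct Face;

        struct HalfEdge {
            Vertex*   origin; // vertex this half-edge leaves from
            HalfEdge* twin;   // opposite half of the same undirected edge
            HalfEdge* next;   // next half-edge around the same face (CCW)
            Face*     face;   // face this half-edge borders
        };

        struct Vertex { double x, y; HalfEdge* leaving; };
        struct Face   { HalfEdge* edge; }; // any boundary half-edge

        // Example query: visit a face's vertices in order by following next.
        template <typename F>
        void forEachVertex(const Face* f, F visit) {
            HalfEdge* e = f->edge;
            do {
                visit(e->origin);
                e = e->next;
            } while (e != f->edge);
        }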

  • litchralee@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    14
    ·
    5 days ago

    IMO, circular buffers with two advancing pointers are an awesome data structure for high-performance compute. They’re used in virtualized network hardware (see virtio) and for minimizing Linux syscalls (see io_uring). Each ring implements a single-producer, single-consumer queue, so two rings are usually used for bidirectional data transfer.

    It’s kinda obscure because the need for asynchronous-transfer queues doesn’t show up that often unless you’re dealing with hardware or crossing outside a single CPU. But it’s becoming relevant due to coprocessors (i.e. small ARM CPUs attached to a main CPU) that process offloaded requests and quickly return the result when ready.
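
    A minimal sketch of the single-producer, single-consumer idea (much simplified compared to virtio or io_uring rings; names are mine): the producer only ever writes head and the consumer only ever writes tail, so the queue needs no locks.

        #include <atomic>
        #include <cstddef>
        #include <optional>

        template <typename T, std::size_t N>
        struct SpscRing {
            static_assert((N & (N - 1)) == 0, "N must be a power of two");
            T buf[N];
            std::atomic<std::size_t> head{0}; // written only by the producer
            std::atomic<std::size_t> tail{0}; // written only by the consumer

            bool push(const T& v) { // producer side
                std::size_t h = head.load(std::memory_order_relaxed);
                if (h - tail.load(std::memory_order_acquire) == N)
                    return false; // full
                buf[h & (N - 1)] = v;
                head.store(h + 1, std::memory_order_release);
                return true;
            }
            std::optional<T> pop() { // consumer side
                std::size_t t = tail.load(std::memory_order_relaxed);
                if (t == head.load(std::memory_order_acquire))
                    return std::nullopt; // empty
                T v = buf[t & (N - 1)];
                tail.store(t + 1, std::memory_order_release);
                return v;
            }
        };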

    • xthexder@l.sw0.com
      link
      fedilink
      arrow-up
      5
      arrow-down
      1
      ·
      edit-2
      4 days ago

      One cool trick that can be used with circular buffers is to use memory mapping to map the same block of physical memory at two consecutive virtual address ranges. That way you can read the entire contents of the buffer as if it were a regular linear buffer, starting from any offset.
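
      A hedged, Linux-only sketch of that trick using memfd_create (error handling omitted): the same pages are mapped twice, back to back, so a read or write that wraps around the ring end becomes one contiguous operation.

          #include <sys/mman.h>
          #include <unistd.h>
          #include <cassert>
          #include <cstddef>

          int main() {
              const std::size_t size = 4096; // multiple of the page size
              int fd = memfd_create("ring", 0); // anonymous in-memory file
              ftruncate(fd, size);

              // Reserve 2*size of address space, then map the same pages
              // into both halves.
              char* base = static_cast<char*>(
                  mmap(nullptr, 2 * size, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
              mmap(base, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
              mmap(base + size, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);

              base[size - 1] = 'A';              // write at the ring's end...
              assert(base[2 * size - 1] == 'A'); // ...seen through the mirror
          }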