
  • It's because the comments he made are inconsistent with common conventions in data engineering.

    1. It is very common not to deduplicate data and instead just append rows: the current value is the most recent, and all the older ones are simply historical. That way you don't risk losing data and you keep an entire history (see the sketch after this list).
      1. Whilst you could do some trickery to deduplicate the data, it does create more complexity. There's an old saying with ZFS: "Friends don't let friends dedupe", and it's much the same here.
      2. Compression is usually good enough. It will catch duplicated data and deal with it fairly efficiently; not as efficiently as deduplication, but it's probably fine and it's definitely a lot simpler.
    2. Claiming the government does not use SQL
      1. It's possible they have rolled their own solution or they are using MongoDB or something, but this would be unlikely and wouldn't really refute the initial claim.
      2. I believe many other commenters noted that it probably is MySQL anyway.
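
    As a minimal sketch of the append-only pattern from point 1, using SQLite; the readings table, columns, and values are invented for illustration:

    ```python
    # Append-only storage: rows are never updated or deduplicated;
    # the current value is simply the most recent row per key.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings (key TEXT, value REAL, ts INTEGER)")
    con.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [("a", 1.0, 1), ("a", 2.5, 2), ("b", 9.0, 1), ("a", 2.5, 3)],  # dupes kept
    )

    # Latest value per key; everything older remains as queryable history.
    rows = con.execute(
        """SELECT key, value FROM readings AS r
           WHERE ts = (SELECT MAX(ts) FROM readings WHERE key = r.key)"""
    ).fetchall()
    print(rows)  # [('b', 9.0), ('a', 2.5)]
    ```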

    Basically, to anybody who has worked with larger data, what he said is incoherent and inconsistent with typical practices among data engineers.

    In terms of using SQL: it's basically just a more reliable, better Excel that doesn't come with a default GUI.

    If you need to store data, it's almost always best to throw it into a SQLite database, because it keeps the data structured, it's standardised, and it can be used from any programming language.

    However, many people use Excel because they don't have experience with programming languages.

    Get ChatGPT to help you write a PyQt GUI for a SQLite database (something like the sketch below) and I think you would develop a high-level understanding of how the pieces fit together.
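
    A minimal sketch of that idea, assuming PyQt5 is installed; the database file, table, and columns below are made up for illustration:

    ```python
    # A spreadsheet-like, editable GUI over a SQLite table.
    import sys
    import sqlite3

    from PyQt5.QtWidgets import QApplication, QTableView
    from PyQt5.QtSql import QSqlDatabase, QSqlTableModel

    # Create an example database with one table.
    con = sqlite3.connect("example.db")
    con.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER)")
    con.execute("INSERT INTO people VALUES ('Ada', 36), ('Grace', 45)")
    con.commit()
    con.close()

    app = QApplication(sys.argv)

    # Point Qt's SQL layer at the same file and bind the table to a model.
    db = QSqlDatabase.addDatabase("QSQLITE")
    db.setDatabaseName("example.db")
    db.open()

    model = QSqlTableModel()
    model.setTable("people")
    model.select()

    # The GUI part: an editable grid, which is where the Excel comparison comes from.
    view = QTableView()
    view.setModel(model)
    view.show()

    sys.exit(app.exec_())
    ```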

    Edit: @zalgotext made a good point.

  • If you need to use PyTorch, i.e. predictive modelling using neural networks, you need to use NVIDIA.

    And the ROCm stuff is catching up but, at least a few years ago, it was a massive PITA.
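
    If you want to check what your install can actually use, a quick sketch (standard torch calls, nothing assumed beyond having PyTorch installed):

    ```python
    # Check which accelerator backend this PyTorch build can see.
    import torch

    print(torch.cuda.is_available())  # True on working CUDA (or ROCm) builds
    print(torch.version.cuda)         # CUDA toolkit version, or None
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # the GPU model string
    ```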

  • A lot of mathematical languages index from 1: R, Julia, Mathematica (and also Lua and Fish).

    I don't know why, but in R, for example, it doesn't bother me, yet I get caught by it in Lua all the time.

    I suppose it's a function of how far the array is abstracted from being a pointer to an address; the further away, the easier it is to mentally switch.
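
    The trap in miniature, with Python standing in for the 0-based side (the list here is just an illustration):

    ```python
    # Python, like C, indexes from 0; R, Julia, Mathematica and Lua index from 1.
    xs = [10, 20, 30]
    print(xs[0])            # first element in Python -> 10
    print(xs[len(xs) - 1])  # last element -> 30
    # In Lua, the equivalents would be xs[1] and xs[#xs].
    ```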

  • Well it's probably both.

    Insights from the data probably were not being actioned, but I would strongly suspect that the data they are collecting simply doesn't have a lot of predictive capacity.

    However, I don't work with that specific data. I work with related data.

    The Western world is far too concerned with test scores when they are just complete and utter bs.

    I would say a test score is accurate to plus or minus 40% in terms of a student's understanding of a subject. They're just arbitrary.

  • I work with school data; I've worked with university, high school, and primary school data. It is indeed all bullshit, in large part because test scores are just noise and behaviour metrics are subjective and non-standard.

    I've never been able to develop a model with any predictive capacity whatsoever. Moreover, visualisations only ever show correlation and often do more harm than good, as staff assume their actions are causing improvement when typically it's advantaged students simply taking advantage of more activities, etc.

    The post above is certainly more insightful than Elon Musk's opinion, and this is coming from somebody who works with this type of data.

    Again, I wouldn't suggest pulling it all apart. I would look deeply into the problem, but this is really not the worst thing they've done.

  • I think this, combined with the solution provided in this comment, will be the most robust approach and solve all your problems.

    That's what I would do.

  • With Arch, pacman -Syu will do it for you. Generally you are encouraged to stick with the version in the repositories.

    You can install things from source by downloading the source code, building it (e.g. gcc code.c or cargo build), and then copying the binary somewhere.

    Typically, if you were going to install something from source, you would write a PKGBUILD for it; that integrates it with pacman, so you have a centralised manager of everything you have installed, which simplifies updates, removals, conflicts, etc. A minimal sketch is below.
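
    For illustration, a skeleton PKGBUILD; the package name, tarball, and build commands are placeholders rather than a real package:

    ```bash
    # Skeleton PKGBUILD (all names and sources are placeholders).
    pkgname=hello-example
    pkgver=1.0
    pkgrel=1
    pkgdesc="Toy example"
    arch=('x86_64')
    license=('MIT')
    source=("$pkgname-$pkgver.tar.gz")
    sha256sums=('SKIP')

    build() {
        cd "$pkgname-$pkgver"
        gcc -O2 -o hello hello.c
    }

    package() {
        cd "$pkgname-$pkgver"
        install -Dm755 hello "$pkgdir/usr/bin/hello"
    }
    ```

    Running makepkg -si next to that file builds the package and installs it through pacman, so it can later be updated or removed like anything else.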

    Doing this for small packages is pretty trivial and sometimes necessary. For a large package like KDE Plasma it is a very large undertaking, and you would never do it in practice.

    The maintainers package the desktop environment with a PKGBUILD, test it, and then upload it so that you can use it.

    Also note that when the Arch maintainers package that software, they compile it into a binary, so you only have to download it; you don't also have to build it.

  • MATE is really nice; I was always a fan. (Although XFCE is nice too.) However, I don't believe it has Wayland support yet?

    I think LXDE has Wayland now but I haven't tried it.

  • This is based on my experience teaching at university; your mileage may vary. This is what I found to work best for first-year students.

  • I really like Void. ZFS support is quite good (Btrfs is better nowadays though, so not a big deal) and it's a lot more BSD-like, which is very nice from a simplicity perspective. The documentation is also very good.

    Runit boots very fast and is quite simple too.

    Packages are less recent though, and I've had pain with some things (e.g. some Qt stuff, Studio, and some other misc packages; Neovim + Jupyter/Podman solved this for me, though).

  • It is actually quite bad to use. If for whatever reason I needed a commercial OS, I'd have to use macOS at this stage.

    Microsoft has really dropped the ball in terms of quality.

  • Nah man, you just download an ISO and press next on the install screen.

    I didn't install it for my family; my siblings did, and they are labourers. If something went wrong they might ask me, but it's been 8 years and I've never had to touch any of the family computers. However, they are only used to browse the web, so there's not much to go wrong.

    I had to do a lot more maintenance on Windows a decade ago, when they used Excel for the family business. That was why they switched to Linux: Apple sheets on macOS was vastly more stable, but Macs were $$$$, and Linux was the better compromise.

    People like to simp for M$, but for stability and simplicity, Linux is vastly better for a home user.

    I can't comment on enterprise use; there seems to be a lot of love for Microsoft Group Policies and VMware among IT professionals. I don't like it, but it must be good; not my area.

  • EndeavourOS. It may be a bit more hands-on than something like Ubuntu/Fedora, but there are way fewer abstractions, plus better documentation and community support, which makes it simpler overall.

    Pick up a note-taking application like Joplin or something and write down solutions to problems and you'll be fine.

    I'd recommend against Ubuntu/Fedora/Mint etc., tbh; they are simpler on the surface, but there are more moving parts that make them more complex when things break.

    Play around with Distrobox and Docker too; that makes a lot of stuff easier.

  • I enjoyed GrapheneOS, but I had to go back to the stock Pixel OS because it was noticeably slower.

    I don't know why and I don't know if that is still the case.

    Everything worked flawlessly though, very impressive. If ever I buy a new phone I will be using it again. The only limitation, for me, was performance.

    However, I try to use my laptop for everything I do; the phone is just for gym/job and the dreaded occasional phone call.

  • Absolutely the best way to learn, though. The number of places I've walked into that had no clue about containers or even a VPC, and thought Google Drive was an API, is too damn high.

  • I don't, and I'm not sure if I'm in the minority. I just plug in my laptop or cast my phone (Jellyfin or any other misc streaming service).

  • Informative post, thanks. I think a boxplot would have worked better here.
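
    Something like this, with made-up numbers purely to show the shape of what I mean:

    ```python
    # A minimal boxplot sketch with made-up data.
    import matplotlib.pyplot as plt

    groups = [[2, 3, 3, 4, 7], [1, 2, 5, 5, 6], [3, 4, 4, 8, 9]]
    plt.boxplot(groups)                     # one box per group
    plt.xticks([1, 2, 3], ["A", "B", "C"])  # label the groups
    plt.ylabel("value")
    plt.show()
    ```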

  • Mobile offline sync is a lost cause. The dev environment, even on Android, is so hostile you'll never get a good experience.

    Joplin comes close, but it's still extremely unreliable and I've had many dropped notes. It also takes hours to sync a large corpus.

    I wrote my own web app using Axum and Flask, which is what I use now. Check out DokuWiki as well.

  • Oh good to know.

    It used to be awful but I'm glad to hear it's improving.