FFmpeg moves to Forgejo

Clot@lemmy.zip · 5 months ago

psycotica0@lemmy.ca · 4 months ago

Okay, cheat sheet time!

git add just adds things to the index. It also works to add new files to git, because git only ever works on files it already knows about, so the first time a new file is created, you have to add it so git knows to track it. Still goes in the index, though.

git add -p goes through the diff of your working directory and asks if you’d like this change in the index or not. Notably it doesn’t ask about new files, you’ll still have to add those.

git status, so useful, but also simple. Tells you what branch you’re on, what files have been changed since the version in the index, what files have been changed in the index (and so what’s going to be committed at the next commit), and what files exist that git doesn’t know about and you might want to add.

Speaking of which, having a bunch of files here that aren’t in git can be a hazard because it makes it really easy to forget about a new file that you actually did want to add. If there’s a file that will be sticking around for a while that you don’t want to add to the repository, you should tell git to ignore it. If it’s a file that everyone who uses this repo will encounter, like build output or packages that get fetched, it should go in the .gitignore file, which then gets checked in and synced. If it’s something that only you will have, you can instead put it in .git/info/exclude and it will not be checked in and will just exist in your folder. This will help keep the git status relevant and actionable.

git commit stores the index and makes a commit out of it, asking you for a message to go along with it. It also moves your branch head to this new commit, if you’re on a branch, which you should be most of the time.

git commit -a is a useful shortcut for people who know what they’re doing, which is then taught in every intro tutorial to people who don’t know what they’re doing. It just adds all changes to tracked files before doing a commit, which effectively skips the index as a concept. Which is fine if there are no temporary or unrelated changes, but often ends up with people not looking over their changes and adding random test garbage to commits without realizing. See git add -p above. It also doesn’t add new files, which means it works without having to think about it 95% of the time, but then people create a new file and don’t check it in for 10 commits and everything is broken for everyone else. This is a mistake anyone can make, even git add -p folk, and the only cure is actually checking git status, and noticing when it’s warning you about new files.

git add . adds all files in the directory to the index. Also a kind of habit some people get into when they “just want all the changes” but also often ends up with a bunch of garbage being accidentally checked in, like API keys or downloads or patch files or whatever else is in their working dir. It does respect the ignore files, though, so it can be useful if you’re careful.
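To make the difference between git commit -a and git add . concrete, here’s a rough toy model in Python. It is not how git is implemented: the index and working directory are plain dicts, the function names stage_tracked and stage_all are made up for illustration, and ignore matching uses fnmatch, which is much simpler than real .gitignore rules.

```python
import fnmatch


def stage_tracked(index, workdir):
    """Roughly what `git commit -a` stages: only paths git already tracks."""
    return {path: workdir[path] for path in index if path in workdir}


def stage_all(index, workdir, ignore_patterns):
    """Roughly what `git add .` stages: everything on disk that is not ignored."""
    staged = {}
    for path, content in workdir.items():
        ignored = any(fnmatch.fnmatch(path, pat) for pat in ignore_patterns)
        # Already-tracked files are updated even if a pattern would match them.
        if path in index or not ignored:
            staged[path] = content
    return staged


# Hypothetical repository state: one tracked file, one new file, one secret.
index = {"app.py": "v1"}
workdir = {"app.py": "v2", "new_module.py": "v1", "secrets.env": "placeholder"}

after_commit_a = stage_tracked(index, workdir)
after_add_dot = stage_all(index, workdir, ignore_patterns=["*.env"])

print(after_commit_a)  # new_module.py is missing: the forgotten-new-file trap
print(after_add_dot)   # new_module.py is included, secrets.env is skipped
```

Drop the "*.env" pattern and secrets.env gets staged too, which is the accidental API key check-in described above.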

git diff on its own tells you the difference between the files actually in your working directory (the folder on disk) and the index. Not the last commit, like it may seem, but the index, which matches the last commit when nothing is staged. Basically, this tells you the changes you haven’t added yet, but doesn’t list new files.

git diff HEAD does the thing people think, which compares what’s in the working dir with the latest commit. Actually any git diff COMMIT compares the working dir against that commit, and HEAD is a pointer to the current commit.

git diff COMMIT1..COMMIT2 computes the diff between the trees pointed to by those two commits.

git diff --cached is unfortunately named, but this is what shows the diff between what’s in the index and what’s in the latest commit. This is what would be committed if you ran git commit right now. Useful for making sure you haven’t accidentally added a bunch of useless stuff.
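If it helps, the git diff variants above are just comparisons between three snapshots: the last commit, the index, and the working directory. Here’s a minimal sketch of that idea in Python. The dicts and the changed_paths helper are invented for illustration, and real git compares file contents hunk by hunk rather than whole files.

```python
# Toy model: each area is a dict mapping path -> file contents.
head = {"main.c": "v1", "util.c": "v1"}      # snapshot in the latest commit
index = {"main.c": "v2", "util.c": "v1"}     # main.c was staged with `git add`
workdir = {"main.c": "v2", "util.c": "v2", "notes.txt": "todo"}  # files on disk


def changed_paths(old, new, tracked):
    """Tracked paths whose contents differ between two snapshots."""
    return sorted(p for p in tracked if old.get(p) != new.get(p))


tracked = set(head) | set(index)

unstaged = changed_paths(index, workdir, tracked)   # like `git diff`
since_head = changed_paths(head, workdir, tracked)  # like `git diff HEAD`
staged = changed_paths(head, index, tracked)        # like `git diff --cached`
untracked = sorted(p for p in workdir if p not in index)  # only `git status` lists these

print(unstaged, since_head, staged, untracked)
```

Note that notes.txt never appears in any of the three diffs, because git doesn’t know about it yet.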

git log shows the commit history.

git log -p shows the commit history, but also computes the diff between each commit and its parent so you can see the changes.

Now for the elephant in the room, git merges and rebases. Given the data model I’ve explained to you, merges are easy. We have branches because multiple different commits can claim the same parent, which allows history to diverge. But someday we may want history to come back together again, like if I branch off to work on a feature, and now the feature is done and I want to merge to the main branch. The way this works is that we make a commit that references multiple parents, tying the two histories together. Simple! But the question is what snapshot do I store with this commit? If I pick the snapshot from either side, the other side’s changes won’t be present. What I want is to blend these snapshots, so git does what’s called a three-way merge. I first find the point where my two branches diverge, their shared common ancestor, and then I find the diff between each of the branches’ tips and this common ancestor. Then I try to apply these patches to the common ancestor and if both apply cleanly, then I’m done! I store that and point the commit at it, referencing both parents as I said, and now history is tied together.

If there are conflicts, though, git will dump the conflicts into the working directory and say “you figure this out” and then you manually merge what it couldn’t do automatically, and then use git add like normal to tell git “this is what my merge commit should contain”, and then it does.
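Here’s a small sketch of the three-way merge idea in Python, under a big simplifying assumption: it treats each whole file as the unit of change, whereas real git merges inside files, hunk by hunk. The function name three_way_merge and the sample snapshots are made up for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots against their common ancestor, file by file.

    Snapshots are dicts of path -> contents; a missing path means "no file".
    Returns (merged snapshot, list of conflicting paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o              # both sides agree, or neither touched it
        elif o == b:
            result = t              # only their side changed this file
        elif t == b:
            result = o              # only our side changed this file
        else:
            conflicts.append(path)  # both changed it differently: ask the human
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "ours"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "theirs", "d.txt": "new"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # changes from both sides, minus the conflicting file
print(conflicts)  # the part git would dump in your working directory
```

The common ancestor is what makes this work: without base, there would be no way to tell which side actually changed a.txt and which side just left it alone.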

So that’s merges. It’s great because it represents history, and only references previously existing commit hashes, but it’s also sometimes messy because the true history can be messy. The classic example is a feature branch that wanted to keep up with the main branch, and so has several merge commits from the main branch into the feature branch, which are still part of the history when that later gets merged to the main branch, leading to a commit graph that’s very noisy and has lots of crosses. It works, but people don’t like it.

So then there’s rebase. Before that, let’s talk about git cherry-pick. It has an easy job. It takes a commit, computes the diff between it and its parent to get the “patch”, or set of changes, this commit represented, and then tries to apply that patch, making those changes, here on the current branch. If it succeeds it makes a new commit that has the same message as the one that’s being cherry picked, and if there are conflicts it asks the human to fix them like normal before doing the add and commit steps. So it’s trying to “pick up that patch and put it here”, replicating its outcome in a new context. And it makes a new commit that looks like the old one for consistency. But this is important! It looks like the old one, but it is not the same as the old one. Remember, what gives a commit its identity is its hash, and its hash comes from its content. And the content is not the diff. That’s computed. The content is the commit message, which is the same, but also the parent commit which is totally different, and the snapshot of the entire set of files, which will also be totally different. Sure, the patch will be the same because it was based on the original, but presumably the other files on this branch aren’t the same, and maybe even other parts of the files this patch touches will be different. That’s the point of the cherry-pick, to take this change set and transplant it into a new context. Well, that new context has new file contents and a new parent, which means new hashes, which means this commit has a new commit hash and is effectively totally different, despite having the same message. And if there were conflicts, it might not even end up with the same patch, just a similar one.
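A quick way to convince yourself of the “same message, different commit” point is to hash things by hand. This is a simplified sketch, not git’s real object format: actual commit objects also include author, committer and timestamps, and trees are hashed differently. The helpers tree_id and commit_id and the sample files are invented for illustration.

```python
import hashlib


def tree_id(files):
    """Hash a whole snapshot (path -> contents), standing in for a git tree."""
    payload = "".join(f"{p}\0{c}\n" for p, c in sorted(files.items()))
    return hashlib.sha1(payload.encode()).hexdigest()


def commit_id(tree, parent, message):
    """Hash the things that identify a commit: snapshot, parent and message."""
    payload = f"tree {tree}\nparent {parent}\n\n{message}\n"
    return hashlib.sha1(payload.encode()).hexdigest()


message = "Fix off-by-one in parser"

# The original commit, sitting on a feature branch.
feature_parent = commit_id(tree_id({"parser.py": "buggy"}), "none", "Add parser")
original = commit_id(
    tree_id({"parser.py": "fixed"}), feature_parent, message
)

# The cherry-picked copy: same patch and message, but main has other files.
main_tip = commit_id(
    tree_id({"parser.py": "buggy", "cli.py": "v3"}), "none", "Add CLI"
)
picked = commit_id(
    tree_id({"parser.py": "fixed", "cli.py": "v3"}), main_tip, message
)

print(original)
print(picked)  # a different hash, even though the message is identical
```

Feed in exactly the same snapshot, parent and message and you get the same hash back, which is why a commit that hasn’t been rewritten keeps its identity everywhere.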

Okay, so that’s git cherry-pick. But what if I’m on a branch with multiple commits that I want to “catch up” to the main branch? I can just find all the commits this branch has that the main branch doesn’t, switch to the main branch, and then cherry pick the old commits one after the other. Now I’ll be on a new branch, on a new commit, but it will “feel like” the old one, with the same changes, but updated to be “re-based” on the new main branch. As in, the branch branches off main at a different point. The base is different. It was rebased. Get it!?
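The “rebase is just a series of cherry-picks” idea can be sketched in a few lines of Python. Treat this as a toy under loud assumptions: commits live in a dict, a patch is just “files this commit changed”, deletions and conflicts are ignored, and all the names (make_commit, cherry_pick, rebase) are made up rather than anything git exposes.

```python
import hashlib

commits = {}  # commit id -> {"parent": id or None, "message": str, "files": dict}


def make_commit(parent, message, files):
    """Store a commit whose id is derived from its parent, message and snapshot."""
    payload = repr((parent, message, sorted(files.items()))).encode()
    cid = hashlib.sha1(payload).hexdigest()[:12]
    commits[cid] = {"parent": parent, "message": message, "files": dict(files)}
    return cid


def patch_of(cid):
    """Files this commit changed relative to its parent (deletions ignored)."""
    commit = commits[cid]
    before = commits[commit["parent"]]["files"] if commit["parent"] else {}
    return {p: c for p, c in commit["files"].items() if before.get(p) != c}


def cherry_pick(cid, onto):
    """Apply one commit's patch on top of `onto`, making a brand new commit."""
    files = dict(commits[onto]["files"])
    files.update(patch_of(cid))  # toy: no conflict detection
    return make_commit(onto, commits[cid]["message"], files)


def ancestors(cid):
    """The commit and everything reachable through parent links, newest first."""
    chain = []
    while cid is not None:
        chain.append(cid)
        cid = commits[cid]["parent"]
    return chain


def rebase(branch_tip, new_base):
    """Replay the commits the branch has that the base lacks, oldest first."""
    base_history = set(ancestors(new_base))
    to_replay = [c for c in ancestors(branch_tip) if c not in base_history]
    tip = new_base
    for cid in reversed(to_replay):
        tip = cherry_pick(cid, tip)
    return tip


root = make_commit(None, "initial", {"app.py": "v1"})
f1 = make_commit(root, "add feature", {"app.py": "v1", "feature.py": "draft"})
f2 = make_commit(f1, "finish feature", {"app.py": "v1", "feature.py": "done"})
m1 = make_commit(root, "fix bug", {"app.py": "v2"})

new_tip = rebase(f2, m1)
print(commits[new_tip]["files"])  # the feature's changes plus main's bug fix
```

The rebased tip has the same message and the same feature changes, but a different id, and the old tip f2 is nowhere in its history.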

You can use git rebase -i to actually see what it’s about to do beforehand. It finds a bunch of commits and then gets ready to pick them.

This can be great, but can also be a nightmare. Mostly because the hashes of everything have changed. When collaborating with people, they’ll see a branch be at one commit, and then the next time they look it’ll have jumped to a completely different set of commits that don’t follow from the one they used to know. They’re not in the history of the new commits, it’s just different. This makes them grumpy.

And because the new commits are unique, if you’ve messed up your history before you can end up with the “same” commit multiple times in history, because actually they’re different rebased copies of each other. And rebasing a previous merge commit can be a real beast because it just makes things more complicated.

Anyway, it’s not a problem problem, it’s just something to be careful about.

psycotica0@lemmy.ca · 4 months ago

And now I’m running out of time, but there’s one more thing I want to talk about, which is my best friend git reflog.

git reflog is just a log of all the commit hashes you’ve ever been at, and why each one changed. Using this you can recover from almost anything you do within git. Bad rebase? That’s okay, branches are just pointers to commit hashes, and the old commit hashes are still there, same as they ever were. And the reflog remembers what those hashes were. Accidentally reset your branch to a bad place? Git reflog knows how to find your way home. Deleted a branch that still had a change on it you forgot to merge? The name may be gone, but the hash isn’t. Reflog knows its old address, and you can just point a new name there, or inspect its log by hash, or cherry-pick it.
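For the curious, here’s a toy sketch of why the reflog makes recovery possible. It’s only a model: the Repo class, its move method and the short hashes are all hypothetical, and real git keeps separate reflogs for HEAD and for each branch. The core idea is the same, though: every time a pointer moves, the old and new hashes get written down along with the reason.

```python
class Repo:
    """Toy repository: branches are names pointing at commit hashes."""

    def __init__(self, start):
        self.branches = {"main": start}
        self.reflog = []  # oldest entry first (git itself prints newest first)

    def move(self, branch, new_commit, reason):
        """Point a branch somewhere else and record where it used to be."""
        old_commit = self.branches.get(branch)
        self.reflog.append((old_commit, new_commit, reason))
        self.branches[branch] = new_commit


repo = Repo("a1b2c3")  # placeholder hashes, not real commits
repo.move("main", "d4e5f6", "commit: add parser")
repo.move("main", "a1b2c3", "reset: moving to HEAD~1")  # oops, lost d4e5f6?

# The branch no longer points at d4e5f6, but the log remembers it.
lost_commit = repo.reflog[-1][0]
repo.move("main", lost_commit, "reset: moving to " + lost_commit)

print(repo.branches["main"])
for old, new, reason in reversed(repo.reflog):
    print(old, "->", new, reason)
```

Nothing was ever deleted by the bad reset, only the pointer moved, so pointing it back is all the recovery there is.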

Git reflog loves you.

And now I have to go, but maybe I’ll say more later.