Advanced Git & GitHub Techniques

  • Session Flow

    • Learning Objective

      • Introduction

      • Theme

      • Primary Goals

    • Advanced Git & GitHub Techniques

      • Git Rebase

        • How Git Rebase Works

        • Benefits of Git Rebase

        • Use Cases of Git Rebase

      • Git Cherry-Pick

        • How Git Cherry-Pick Works
      • Git Worktrees

      • Activity: Fill in the blanks

      • GitHub Actions

        • Key Concepts

        • How GitHub Actions Works

        • Benefits and Use Cases

      • GitHub Pages

      • GitHub CLI

        • How to use GitHub CLI for Authentication?
      • Activity: True or False

    • Summary

      • What did we learn?

      • Shortcomings & Challenges

      • Best Practices

    • Enhance Your Knowledge

    • Try it yourself

Learning Objective

Introduction

Advanced Git and GitHub take your version control and collaboration skills to the next level. With advanced Git techniques, you can streamline your workflow, manage complex projects, and collaborate more effectively. GitHub offers powerful features like automated workflows and security tools to supercharge your code development and collaboration efforts.

Focus: GitHub Actions, GitHub Advanced Features, Advanced Git Techniques

Prerequisites: Git and GitHub basics

Theme

Amazon uses Git as the primary version control system for tracking code changes in its software projects. Teams create Git repositories for their projects to manage and collaborate on code. Code quality is crucial at Amazon, so teams use pull requests or similar code review mechanisms to check that changes meet quality standards before they are merged. Git and GitHub enable Amazon's geographically dispersed development teams to collaborate effectively. Teams can work on their own branches and merge changes using Git's branching and merging capabilities.

Netflix, known for its microservices architecture, uses Git to manage an extensive codebase spanning many repositories. Git's distributed nature allows teams to work independently. Netflix uses GitHub for collaboration across teams and departments. Engineers can easily fork repositories, make changes, and create pull requests for cross-team collaboration. Netflix has developed custom tools and utilities around Git and GitHub to meet the unique needs of their development workflow, including versioning of internal libraries.

Primary Goals

  • Understand advanced branching strategies like Gitflow, GitHub Flow, or other customized workflows that suit your project's needs.

  • Learn how to merge and rebase branches appropriately, considering factors like commit history and collaboration with others.

  • Learn strategies for managing large repositories efficiently, including techniques for dealing with large files and optimizing Git performance.

  • Understand and utilize GitHub's advanced features like GitHub Actions for CI/CD and code scanning to enhance your development workflow.

Advanced Git & GitHub Techniques

Git Rebase

Git rebase is a powerful and often-used Git command that allows you to reorganize, combine, or modify the commit history of a Git branch. It differs from git merge, which combines changes from one branch into another, in that it "replays" the commits from one branch onto another branch, resulting in a more linear and streamlined history. Here's a detailed explanation of Git rebase:

Basic Git Rebase Syntax:

The basic syntax for a Git rebase is as follows:

git rebase <base_branch>
  • <base_branch> is the branch onto which you want to rebase your current branch.

How Git Rebase Works

  1. Select a Base Branch: You start by checking out the branch you want to rebase (the "topic branch"). Then, you specify a base branch (the branch onto which you want to apply the changes from the topic branch).

  2. Identify Common Ancestor: Git identifies the common ancestor commit between the base branch and the topic branch. This is the point at which the two branches diverged.

  3. Save Changes: Git sets aside the commits made on the topic branch since the common ancestor in a temporary area. This is internal bookkeeping for the rebase and is separate from git stash.

  4. Move to Base Branch: Git moves the current branch pointer to the same commit as the base branch. This effectively makes it appear as if you started working on the topic branch from the current state of the base branch.

  5. Apply Changes: Git then re-applies the saved changes (commits) from the topic branch one by one on top of the base branch. This process is akin to replaying the commits, essentially creating new commits with the same changes but a different commit history.

  6. Conflict Resolution: If there are conflicts during the rebase process, Git stops and prompts you to resolve them manually, just like in a merge conflict. Once resolved, you continue the rebase by using git rebase --continue.

  7. Finish the Rebase: After all the commits are successfully replayed, the rebase is complete. The commit history now appears as if the changes were made directly on top of the base branch, resulting in a linear history.
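
The steps above can be tried safely in a throwaway repository. The following is a minimal sketch, assuming Git is installed; the branch names (main, feature), file names, and commit messages are invented for illustration.

```shell
# Create a scratch repository in a temporary directory.
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

# One shared commit on main: this becomes the common ancestor.
echo "base" > "$repo/base.txt"
git -C "$repo" add base.txt
git -C "$repo" commit -q -m "Initial commit"

# The topic branch gets two commits of its own.
git -C "$repo" checkout -q -b feature
echo "one" > "$repo/feature.txt"
git -C "$repo" add feature.txt
git -C "$repo" commit -q -m "Feature: part 1"
echo "two" >> "$repo/feature.txt"
git -C "$repo" commit -q -am "Feature: part 2"

# Meanwhile main moves ahead, so the two branches diverge.
git -C "$repo" checkout -q main
echo "main work" > "$repo/main.txt"
git -C "$repo" add main.txt
git -C "$repo" commit -q -m "Main: unrelated change"

# Check out the topic branch and replay its commits on top of main.
git -C "$repo" checkout -q feature
git -C "$repo" rebase -q main

# The graph is now a straight line: main's commits, then the feature commits.
git -C "$repo" log --oneline --graph
```

After the rebase, the two feature commits have new hashes because they were recreated on top of the latest commit on main.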

Benefits of Git Rebase

  1. Cleaner Commit History: Rebasing helps maintain a cleaner and more linear commit history by removing unnecessary merge commits that can clutter the history.

  2. Improved Readability: A linear history is easier to read and understand, making it simpler to trace the evolution of the codebase.

  3. Conflict Resolution Control: Because commits are replayed one at a time, each conflict surfaces at the commit that causes it. You resolve it in that context rather than in a single large merge commit.

Use Cases for Git Rebase

  1. Feature Branch Updates: Developers often use rebase to keep their feature branches up to date with the latest changes from the main branch before merging.

  2. Squashing Commits: You can use rebase to squash multiple small, related commits into a single, more meaningful commit with a clear and concise message.

  3. Maintaining a Clean History: Rebasing can help maintain a clean and linear commit history, which is especially useful in open-source projects where a clean history is valued.

Note: While rebasing offers many benefits, it should be used with caution, especially when rebasing commits that have already been pushed to a shared repository. Rewriting history can cause problems for collaborators, so it's best suited for branches that are not yet shared or for branches that you explicitly coordinate with your team.
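
The squashing use case is usually done with an interactive rebase (git rebase -i), which opens an editor. As a non-interactive sketch of the same idea, the example below records a follow-up change as a fixup commit and lets git rebase -i --autosquash fold it into the original. It swaps manual todo-list editing for the --fixup/--autosquash mechanism; branch names, file names, and messages are invented.

```shell
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

echo "base" > "$repo/base.txt"
git -C "$repo" add base.txt
git -C "$repo" commit -q -m "Initial commit"

# First attempt at the feature.
git -C "$repo" checkout -q -b feature
echo "draft" > "$repo/feature.txt"
git -C "$repo" add feature.txt
git -C "$repo" commit -q -m "Add feature"

# A small correction, recorded as a fixup of the previous commit.
echo "fixed" > "$repo/feature.txt"
git -C "$repo" commit -q -a --fixup=HEAD

# --autosquash arranges the todo list so the fixup is folded into its target.
# GIT_SEQUENCE_EDITOR=true accepts that list as-is, so no editor opens.
GIT_SEQUENCE_EDITOR=true git -C "$repo" rebase -q -i --autosquash main

# Only one commit remains on the branch, carrying the corrected content.
git -C "$repo" log --oneline main..feature
```

Because this rewrites the branch's commits, the caution in the note above applies: do it before the branch is shared, or coordinate with your team.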

Git Cherry-Pick

git cherry-pick is a Git command used to apply a specific commit from one branch to another. This can be useful when you want to incorporate changes from one branch into another without merging the entire branch. Cherry-picking allows you to pick and choose individual commits and apply them selectively to your current branch. Here's a detailed explanation of how git cherry-pick works:

Basic Syntax:

git cherry-pick <commit-hash>

How git cherry-pick Works

  1. Identify the Target Commit: First, you need to identify the commit you want to pick. This is typically done by noting the commit's unique hash or by using tools like git log to find the commit in the branch's history.

  2. Checkout the Destination Branch: Make sure you are on the branch where you want to apply the changes. Use git checkout <destination-branch> to switch to the target branch.

  3. Cherry-Pick the Commit: Run the git cherry-pick command, specifying the hash of the commit you want to apply. For example:

     git cherry-pick <commit-hash>
    

    Git takes the changes introduced by the specified commit and attempts to apply them to the current branch. If there are no conflicts, Git creates a new commit with the same changes on the current branch.

  4. Resolve Conflicts (If Necessary): If there are conflicts between the changes in the picked commit and the current state of the branch, Git will pause the cherry-pick process and mark the conflicted files. You must resolve these conflicts manually by editing the affected files. After resolving conflicts, add the changes using git add and continue the cherry-pick process by running git cherry-pick --continue.

  5. Commit the Changes: When the cherry-pick completes (either automatically or after you resolve conflicts), the current branch has a new commit containing the changes from the picked commit. To change the commit message, run the cherry-pick with the --edit option or amend the new commit afterwards with git commit --amend.

  6. Finish the Cherry-Pick: If you have several commits to cherry-pick, repeat steps 3 to 5 for each one, or pass several commit hashes to a single git cherry-pick command. In the second case, run git cherry-pick --continue after resolving each conflict to move on to the remaining commits.

  7. Push Changes (If Applicable): After successfully cherry-picking commits and resolving conflicts (if any), you can push the changes to the remote repository if you're working with a shared branch.
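
The workflow above can be sketched in a scratch repository. This example assumes Git is installed and uses invented branch names, file names, and commit messages. It picks a single bug-fix commit from a feature branch onto main and leaves the experimental commits behind.

```shell
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

echo "base" > "$repo/base.txt"
git -C "$repo" add base.txt
git -C "$repo" commit -q -m "Initial commit"

# The feature branch mixes experimental work with one self-contained fix.
git -C "$repo" checkout -q -b feature
echo "experimental" > "$repo/experiment.txt"
git -C "$repo" add experiment.txt
git -C "$repo" commit -q -m "Experimental work"

echo "patched" > "$repo/fix.txt"
git -C "$repo" add fix.txt
git -C "$repo" commit -q -m "Fix: handle empty input"
fix_commit="$(git -C "$repo" rev-parse HEAD)"   # step 1: identify the commit

echo "more" >> "$repo/experiment.txt"
git -C "$repo" commit -q -am "More experimental work"

# Step 2: switch to the destination branch. Step 3: pick the fix.
git -C "$repo" checkout -q main
git -C "$repo" cherry-pick "$fix_commit"

# main now contains the fix as a new commit with its own hash.
git -C "$repo" log --oneline main
```

The new commit on main has the same message and changes as the original, but a different hash, because its parent is different.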

Notes:

  • Cherry-picking is often used when you want to apply specific bug fixes or features from one branch to another without merging the entire branch.

  • It's important to ensure that the picked commit is relevant and compatible with the target branch, as conflicts and issues can arise if there are significant differences.

  • After cherry-picking, the commit hashes of the cherry-picked commits on the target branch will be different from the original commits in the source branch.

  • git cherry-pick can be a valuable tool for maintaining a clean and focused commit history, especially in long-running projects with multiple branches and contributors.

Git Worktrees

Git worktrees are a feature in Git that allow you to have multiple working directories (or worktrees) associated with a single Git repository. This feature is particularly useful when you need to work on multiple branches or areas of your project concurrently without having to switch back and forth between different directories. Git worktrees were introduced in Git version 2.5.

How Git Worktrees Work

In a typical Git workflow, you have a single working directory (your project folder) associated with one Git repository. When you switch branches or create new branches, you're essentially changing the state of your working directory to match the state of the branch you're on. This can be cumbersome when you want to work on multiple branches at the same time.

Git worktrees solve this problem by allowing you to create additional working directories (worktrees) that are associated with the same Git repository. Each worktree is essentially a separate, isolated instance of your project that can have its own branch checked out. These worktrees share the same repository data, including the commit history and object database, but have their own separate working directories.

Creating a Git Worktree:

To create a new Git worktree, you use the git worktree add command. Here's the basic syntax:

git worktree add <path-to-new-worktree> <branch-name>
  • <path-to-new-worktree> is the path where the new worktree will be created.

  • <branch-name> is the name of the branch you want to check out in the new worktree.

For example, to create a new worktree in a subdirectory called my-feature and check out the feature-branch in that worktree, you would run:

git worktree add my-feature feature-branch

Using Git Worktrees:

Once you've created a worktree, you can work in it just like you would in your main working directory. You can make changes, commit them, switch branches, and perform other Git operations. Uncommitted changes and the staging area are private to each worktree. Commits, branches, and tags live in the shared repository, so a commit made in one worktree is immediately visible from the others without any push or pull.
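
To see this end to end, here is a small sketch that creates a scratch repository and adds a second worktree beside it. It assumes Git 2.5 or later. The directory layout (project and my-feature as sibling folders) is an illustrative choice, not a requirement.

```shell
base="$(mktemp -d)"
repo="$base/project"       # main working directory
wt="$base/my-feature"      # linked worktree, placed next to the project

git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

echo "# Demo" > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" commit -q -m "Initial commit"

# Create the branch, then check it out in its own directory.
git -C "$repo" branch feature-branch
git -C "$repo" worktree add "$wt" feature-branch

# Both directories belong to the same repository.
git -C "$repo" worktree list

# Each directory has a different branch checked out.
git -C "$repo" rev-parse --abbrev-ref HEAD   # prints: main
git -C "$wt" rev-parse --abbrev-ref HEAD     # prints: feature-branch
```

You can now edit files under my-feature without touching the files checked out in project.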

Benefits of Git Worktrees:

  1. Concurrent Development: You can work on multiple branches simultaneously without having to switch back and forth between them.

  2. Isolation: Each worktree is isolated from the others, reducing the risk of accidental changes affecting different branches.

  3. Efficient Testing: Worktrees are useful for testing changes in different branches without needing to clone the repository multiple times.

  4. Independent Staging: You can stage changes in one worktree without affecting the staging area in another.

  5. Cleaner Workflow: It encourages a cleaner and more organized workflow when dealing with multiple branches and features.

Limitations and Considerations:

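  • A branch can be checked out in only one worktree at a time. Git refuses to check out a branch that is already checked out elsewhere unless you force it.

  • Be cautious when deleting worktrees. Remove a linked worktree with git worktree remove rather than deleting directories by hand, and never delete the main repository directory while linked worktrees exist, because it holds the .git data they all depend on.

  • Support for submodules in worktrees is incomplete, so repositories that rely heavily on submodules may not behave as expected.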

Git worktrees are a powerful feature for managing complex workflows, allowing developers to efficiently work on multiple branches and features in parallel while maintaining the integrity of their Git repository.

GitHub Actions

GitHub Actions is a powerful and flexible continuous integration/continuous deployment (CI/CD) platform provided by GitHub. It enables you to automate various aspects of your software development workflow directly within your GitHub repository. GitHub Actions helps you build, test, and deploy your code, making it easier to maintain high-quality software and streamline your development process.

Key Concepts

  1. Workflow: A GitHub Actions workflow is an automated process made up of one or more jobs. It is defined in a YAML file stored in the .github/workflows directory of your repository (for example, .github/workflows/main.yml; any file name ending in .yml or .yaml works). The file specifies what should happen when specific events occur, such as a push to the repository or the creation of a pull request.

  2. Job: A job is a set of steps within a workflow. Each job runs in its own environment, such as a virtual machine or a container. Jobs run in parallel by default, and you can make one job wait for another by declaring a dependency with the needs keyword.

  3. Step: A step is an individual task within a job. Steps can include actions, shell commands, or custom scripts. They represent the smallest units of work in a GitHub Actions workflow.

  4. Action: Actions are reusable units of code that can be used across workflows and repositories. GitHub provides a marketplace of pre-built actions, and you can also create your own custom actions to perform specific tasks.

How GitHub Actions Works

  1. Event Triggers: GitHub Actions workflows are triggered by specific events in your GitHub repository. Common triggers include pushes to the repository, pull request activity, issue comments, and more. You can specify event triggers in your workflow configuration.

  2. Workflow Configuration: You define your workflow by creating a YAML file (e.g., .github/workflows/main.yml) in your repository. This file specifies the workflow's name, the events that trigger it, and the jobs and steps to be executed.

  3. Jobs and Steps: Within your workflow, you can define multiple jobs, each with its own set of steps. These steps can include checking out the code, building and testing your application, and deploying it to a hosting environment.

  4. Runners: GitHub provides runners, which are virtual machines or containers where your jobs run. You can use GitHub-hosted runners or self-hosted runners on your infrastructure. Runners have different operating systems and software environments to match your project's needs.

  5. Matrix Builds: GitHub Actions allows you to create matrix builds, which run the same set of steps on multiple platforms or configurations in parallel. This is helpful for testing your code across different environments.
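
To make these pieces concrete, the sketch below writes a small workflow file into a scratch project directory. It shows event triggers, one job, a two-entry matrix, and two steps. The file name ci.yml, the job name test, and the echo command are placeholders for illustration. actions/checkout is a commonly used action published by GitHub.

```shell
project="$(mktemp -d)"
mkdir -p "$project/.github/workflows"

# The quoted 'EOF' keeps the shell from expanding ${{ ... }} expressions.
cat > "$project/.github/workflows/ci.yml" <<'EOF'
name: CI

# Event triggers: run on pushes to main and on every pull request.
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    # Matrix build: the same steps run once per operating system.
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4
      - name: Run tests (placeholder command)
        run: echo "replace this with your project's test command"
EOF

cat "$project/.github/workflows/ci.yml"
```

Once a file like this is committed and pushed to a GitHub repository, each matching event starts a workflow run, and the results appear under the repository's Actions tab.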

Benefits and Use Cases

  • Continuous Integration (CI): GitHub Actions enables you to automatically build and test your code with every push or pull request, so regressions are caught before they reach the main branch.

  • Continuous Deployment (CD): You can automate the deployment of your application to various hosting platforms, such as AWS, Azure, or GitHub Pages, after successful tests and code reviews.

  • Scheduled Jobs: You can schedule periodic jobs for tasks like data backups, reports generation, or sending notifications.

  • Custom Workflows: GitHub Actions is highly customizable. You can create workflows tailored to your project's needs, incorporating various actions and steps.

  • Integration with Third-Party Services: GitHub Actions integrates seamlessly with third-party services and tools, such as Docker, Slack, and Kubernetes, allowing you to create comprehensive CI/CD pipelines.

  • Cross-Platform Testing: You can use GitHub Actions to test your code on different operating systems, browsers, and environments to ensure cross-platform compatibility.

GitHub Actions Marketplace: GitHub offers a marketplace of pre-built actions created by the community. These actions cover a wide range of tasks, from deploying to cloud providers to sending notifications via popular messaging platforms.

In summary, GitHub Actions is a versatile CI/CD platform tightly integrated with GitHub, allowing you to automate your software development workflows, improve code quality, and accelerate the delivery of your projects. It provides powerful customization options, extensive compatibility with various tools, and seamless integration with your existing GitHub repositories.
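
As an illustration of the scheduled-jobs use case, the following sketch writes a workflow that runs on a cron schedule and reads a credential from the repository's secrets. The workflow name, the secret name BACKUP_TOKEN, and the script path ./scripts/backup.sh are all hypothetical.

```shell
project="$(mktemp -d)"
mkdir -p "$project/.github/workflows"

cat > "$project/.github/workflows/nightly.yml" <<'EOF'
name: Nightly backup

on:
  schedule:
    - cron: "0 3 * * *"    # every day at 03:00 UTC
  workflow_dispatch:        # also allow manual runs from the Actions tab

jobs:
  backup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run backup script (hypothetical path)
        env:
          # The value comes from the repository's secrets, not from this file.
          BACKUP_TOKEN: ${{ secrets.BACKUP_TOKEN }}
        run: ./scripts/backup.sh
EOF

cat "$project/.github/workflows/nightly.yml"
```

Passing the secret through env keeps the credential out of the workflow file itself, and GitHub masks secret values in the run logs.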

GitHub Pages

GitHub Pages is a free hosting service provided by GitHub that allows you to publish and host static websites directly from your GitHub repositories. It's a convenient way to showcase your projects, create documentation, host personal blogs, or build simple websites without the need for a separate web hosting service. Here's a detailed explanation of GitHub Pages:

Key Features and Concepts:

  1. Static Websites: GitHub Pages is designed for hosting static websites. Static websites consist of HTML, CSS, JavaScript, and other static assets that do not require server-side processing or databases. This makes them fast, secure, and easy to deploy.

  2. Repository-Based Hosting: GitHub Pages hosts websites directly from GitHub repositories. Each GitHub repository can have its own GitHub Pages site, which is accessible using a specific URL based on your GitHub username and repository name.

  3. Branch-Based Deployment: GitHub Pages allows you to choose the source for your website. When publishing from a branch, you select the branch (for example, main) and the folder within it (the repository root or /docs). GitHub Pages then serves the index.html, index.md, or README.md file it finds there as the site's entry page.

Setting Up GitHub Pages:

To create a GitHub Pages site, follow these steps:

  1. Create a Repository: If you don't have one already, create a new GitHub repository or use an existing one to store your website's files.

  2. Add Content: Add your website's HTML, CSS, JavaScript, and any other static files to your repository.

  3. Configure Settings:

    • In your repository, navigate to the "Settings" tab.

    • Open the "Pages" section of the settings (older versions of the interface show a "GitHub Pages" section further down the main Settings page).

    • Choose the source branch (e.g., main) or directory for your website's content.

  4. Publish Your Site: After configuring the source branch or directory, your GitHub Pages site is automatically deployed and made accessible at a URL based on your GitHub username and repository name. It typically follows the pattern https://<username>.github.io/<repository>.
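
Steps 1 and 2 can be sketched locally as follows. The example creates a scratch repository with a single page in a docs folder. The folder choice, page content, and commit message are illustrative assumptions. Pushing to GitHub and selecting the source in the settings are left as comments because they depend on your account.

```shell
site="$(mktemp -d)"
git init -q "$site"
git -C "$site" symbolic-ref HEAD refs/heads/main
git -C "$site" config user.name "Demo User"
git -C "$site" config user.email "demo@example.com"

# A minimal static page stored under docs/.
mkdir -p "$site/docs"
cat > "$site/docs/index.html" <<'EOF'
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My Project</title>
  </head>
  <body>
    <h1>Hello from GitHub Pages</h1>
  </body>
</html>
EOF

git -C "$site" add docs/index.html
git -C "$site" commit -q -m "Add landing page for GitHub Pages"

# Next steps (not run here): push this repository to GitHub, then choose
# the main branch and the /docs folder as the Pages source in the settings.
```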

Custom Domains:

GitHub Pages allows you to use a custom domain (e.g., www.yourwebsite.com) instead of the default GitHub Pages URL. To set up a custom domain:

  1. Purchase a domain name from a domain registrar.

  2. In your repository's "Settings," open the "Pages" section.

  3. Under "Custom domain," enter your domain name (e.g., www.yourwebsite.com).

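  4. Configure your domain registrar's DNS settings to point to GitHub Pages. For a subdomain such as www.yourwebsite.com, this is typically a CNAME record pointing to <username>.github.io. For an apex domain such as yourwebsite.com, it is a set of A records pointing to GitHub's IP addresses. GitHub's documentation lists the current addresses and detailed instructions for this setup.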

  5. Once DNS propagation is complete (which may take some time), your GitHub Pages site will be accessible via your custom domain.

HTTPS Support:

GitHub Pages automatically provides HTTPS (SSL/TLS encryption) for your website, both for the default GitHub Pages URL and custom domains with proper DNS configuration. This enhances security and ensures that your visitors' connections are encrypted.

Jekyll Integration:

GitHub Pages has built-in support for Jekyll, a popular static site generator. You can use Jekyll to generate pages, blog posts, and shared templates from Markdown and layout files. When you push Jekyll-powered content to your repository, GitHub Pages automatically builds and deploys your site.

GitHub Pages for User/Organization Pages:

In addition to project-specific Pages sites, GitHub Pages also supports personal and organization Pages. User Pages are hosted at https://<username>.github.io, while organization Pages are hosted at https://<orgname>.github.io. These Pages are often used for personal blogs, portfolios, or documentation.

GitHub Pages is a versatile and user-friendly hosting service that empowers developers to easily share their static websites and web content with the world. Whether you're showcasing personal projects, creating documentation, or building a simple website, GitHub Pages simplifies the deployment process and provides a reliable hosting solution.

GitHub CLI

GitHub CLI, often referred to as "gh," is a command-line interface provided by GitHub for interacting with GitHub repositories and performing various GitHub-related tasks directly from the terminal. It offers a convenient way for developers and teams to manage repositories, issues, pull requests, and more without leaving the command-line environment. Here's a detailed explanation of GitHub CLI:

Installation:

  • To get started with GitHub CLI, you need to install it on your computer. GitHub CLI is available for Windows, macOS, and Linux.

Authentication:

  • After installation, you'll need to authenticate your GitHub account with GitHub CLI. You can use either a personal access token or authenticate through the web browser. Once authenticated, your credentials are securely stored, so you don't have to log in every time.

Basic Commands:

  1. Creating a Repository:

    • You can create a new GitHub repository from the command line using the gh repo create command. It guides you through the process, allowing you to choose various options like visibility, description, and license.
  2. Cloning a Repository:

    • You can clone a GitHub repository to your local machine using the gh repo clone command. It automatically sets up the remote origin and fetches the repository.
  3. Viewing Repository Information:

    • The gh repo view command displays a repository's description and README in the terminal. Add the --web flag to open the repository in your browser, where details such as stars, forks, and issues are shown.
  4. Creating and Listing Issues:

    • You can create new issues using gh issue create. You can also list, filter, and search for issues using gh issue list with various options.
  5. Managing Pull Requests:

    • The gh pr create command allows you to create new pull requests from your terminal. You can also list, view, and check out pull requests using other gh pr commands.
  6. Reviewing and Merging Pull Requests:

    • You can review pull requests using gh pr review and merge them with gh pr merge. These commands streamline the code review and merging process.
  7. Working with Branches:

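    • GitHub CLI complements Git's own branch commands rather than replacing them. gh repo fork creates a fork of a repository, gh repo sync updates a local repository or a fork from its parent repository, and gh pr checkout checks out a pull request's branch locally. Creating and switching branches is still done with Git itself (for example, git switch or git checkout).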
  8. Commenting and Interacting with Issues/Pull Requests:

    • You can comment on issues and pull requests using gh issue comment and gh pr comment. Additionally, you can close or reopen an issue with gh issue close and gh issue reopen, and add labels with gh issue edit --add-label.
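
A typical sequence of these commands might look like the sketch below. OWNER/REPO, the pull request number 123, and the title and body text are placeholders. Because the real commands need an installed and authenticated gh, the small run helper (an addition for this example, not part of GitHub CLI) only prints each command unless DRY_RUN is set to 0.

```shell
# DRY_RUN=1 (the default here) prints each command without executing it.
# Set DRY_RUN=0 to run the commands for real against GitHub.
DRY_RUN="${DRY_RUN:-1}"

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Clone a repository and look at its open issues.
run gh repo clone OWNER/REPO
run gh issue list --state open --limit 10

# Open a pull request from the current branch.
run gh pr create --title "Fix typo" --body "Corrects a typo in the README"

# Review someone else's pull request locally, then approve and merge it.
run gh pr checkout 123
run gh pr review 123 --approve
run gh pr merge 123 --squash
```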

Customization and Configuration:

  • GitHub CLI can be customized through configuration files. You can set default behaviors, choose your preferred text editor for opening pull requests or issues, and define aliases for commonly used commands.

Extensions and Plugins:

  • GitHub CLI supports extensions, which are installed with gh extension install and add new gh commands. Developers can write their own extensions to automate tasks specific to their workflows.

GitHub CLI in Automation:

  • GitHub CLI can be integrated into automation scripts and workflows. You can use it to trigger actions or perform GitHub-related tasks as part of your CI/CD pipelines or other automated processes.

Use Cases:

  • GitHub CLI is valuable for both individual developers and teams. It simplifies tasks related to repository management, code reviews, issue tracking, and collaboration, making it easier to work with GitHub repositories from the command line.

Overall, GitHub CLI is a powerful tool that streamlines GitHub-related tasks and integrates seamlessly with your existing command-line workflow. It's especially beneficial for developers who prefer to work primarily in the terminal or those who want to automate GitHub-related tasks in their development pipelines.

How to use GitHub CLI for Authentication?

You can perform authentication using a personal access token (PAT) in GitHub CLI (gh) by configuring it as your authentication method. Here are the steps to authenticate using a token:

  1. Generate a Personal Access Token:

    • Log in to your GitHub account.

    • Click on your profile picture in the upper-right corner, then click on "Settings."

    • In the left sidebar, click on "Developer settings."

    • Under "Personal access tokens," click "Tokens (classic)."

    • Click the "Generate new token (classic)" button.

    • Provide a note for the token, choose the desired scopes (permissions), and set an expiration date if needed.

    • Click the "Generate token" button at the bottom.

      Note: Make sure to copy the generated token immediately because you won't be able to see it again. If you lose the token, you'll need to generate a new one.

  2. Configure GitHub CLI to Use the Token:

    • Open your terminal or command prompt.

    • Use the following command to set your GitHub CLI token:

        gh auth login
      
    • Follow the prompts:

      • Choose "GitHub.com" as the hostname.

      • Select "HTTPS" as your preferred protocol.

      • Select "No" for authenticating Git with your GitHub credentials.

      • Select "Paste an authentication token."

      • Enter the personal access token you generated in the previous step.

  3. Test the Authentication:

    • After configuring the token, you can test the authentication by running a simple command, such as gh repo list.

    • If the token is correctly configured, you should see a list of your repositories (if there are any).

Your GitHub CLI is now authenticated using the personal access token. You can use it to perform various GitHub-related actions without the need to log in repeatedly. The token provides the necessary authentication and authorization to interact with your GitHub repositories and resources securely.

Keep in mind that personal access tokens are sensitive information. Store them securely and do not share them publicly or in plain text files. If you ever need to revoke the token's access, you can do so in your GitHub account settings and generate a new one.
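
For scripts, the interactive prompts can be skipped by passing the token on standard input with gh auth login --with-token. The helper below is a minimal sketch. The function name and the MY_GITHUB_PAT environment variable are made up for this example, and the token is read from the environment so it never appears in the script or in a file.

```shell
# Log in to GitHub CLI with a token held in the MY_GITHUB_PAT variable
# (a hypothetical name; use whatever your secret store provides).
gh_login_with_token() {
  if [ -z "${MY_GITHUB_PAT:-}" ]; then
    echo "MY_GITHUB_PAT is not set; nothing to log in with" >&2
    return 1
  fi
  # --with-token makes gh read the token from standard input.
  printf '%s\n' "$MY_GITHUB_PAT" | gh auth login --hostname github.com --with-token
  # Show which account and host gh is now authenticated against.
  gh auth status
}
```

To use it, export the token in your shell session and then call gh_login_with_token. In CI environments, an alternative is to set the GH_TOKEN environment variable, which gh reads directly without a login step.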

Summary

What did we learn?

  • Git rebase is a Git command used to reorganize or combine commits in a branch to create a cleaner and more linear commit history.

  • Git rebase works by moving or applying a series of commits from one branch to another.

  • The benefits of Git rebase include maintaining a more linear commit history, simplifying code reviews, and resolving conflicts one commit at a time.

  • Git rebase is commonly used for feature branch integration, squashing multiple commits into one, and cleaning up commit history before merging into a shared branch.

  • Git cherry-pick is a Git command used to apply specific commits from one branch to another.

  • Git worktrees allow you to have multiple working directories for the same Git repository, each with its own branch.

  • GitHub Actions is a CI/CD (Continuous Integration/Continuous Deployment) platform integrated with GitHub.

  • GitHub Actions automates tasks like building, testing, and deploying code by running actions in response to events, such as code pushes, pull requests, or issue comments.

  • The benefits of GitHub Actions include automation, custom workflows, and integration with GitHub repositories.

  • GitHub Pages is a feature of GitHub that allows you to host static websites directly from your GitHub repositories.

  • GitHub CLI (gh) is a command-line interface for GitHub. It lets you interact with GitHub repositories, issues, pull requests, and more directly from the terminal.

Shortcomings & Challenges

  • Rebase rewrites commits, so the resulting linear history no longer shows when and where the work actually branched off. This can make it harder to trace the true chronological order of changes.

  • Handling conflicts during rebase can be challenging, and it may require manual intervention.

  • Managing multiple worktrees for the same repository can be confusing for beginners.

  • Git Worktrees require Git version 2.5 or higher.

Best practices to follow

  • Avoid rebasing commits on branches that multiple team members are using, as it can cause confusion and conflicts for others.

  • Worktrees are great for working on multiple branches simultaneously. Use them when you need to switch between branches quickly without committing or stashing changes.

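  • Periodically clean up unused worktrees to avoid clutter and confusion. Use git worktree remove <path> to delete a worktree you no longer need, and git worktree prune to clear leftover records for worktrees whose directories were deleted manually.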

  • When using GitHub Actions, store sensitive information (e.g., API keys, credentials) as GitHub Secrets rather than hardcoding them in your workflow files.

  • GitHub CLI allows you to create custom aliases for commands. Define aliases for frequently used commands to save time and reduce the chance of errors.
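
The worktree cleanup advice above can be sketched in a scratch repository. The example assumes Git 2.17 or later, which provides git worktree remove. Branch and directory names are invented.

```shell
base="$(mktemp -d)"
repo="$base/project"

git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"
echo "# Demo" > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" commit -q -m "Initial commit"

# Two extra worktrees, each on its own branch.
git -C "$repo" branch hotfix
git -C "$repo" branch experiment
git -C "$repo" worktree add "$base/hotfix" hotfix
git -C "$repo" worktree add "$base/experiment" experiment

# Preferred cleanup: Git deletes the directory and its bookkeeping together.
git -C "$repo" worktree remove "$base/hotfix"

# If a worktree directory is deleted by hand, Git still lists it...
rm -rf "$base/experiment"
git -C "$repo" worktree list

# ...until the stale record is pruned.
git -C "$repo" worktree prune
git -C "$repo" worktree list
```

Removing a worktree does not delete its branch. Here, hotfix and experiment remain available in the repository afterwards.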

Enhance your knowledge
