# Troubleshooting Common Issues

<figure><img src="https://1588585907-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MTwgToRvLjYdjfpAVgP%2Fuploads%2FUUFmGdzKwHnTPy71NBGG%2Fimage.png?alt=media&#x26;token=4f2fe2cf-afb9-4f76-9b4e-51110628475c" alt=""><figcaption></figcaption></figure>

## **Cannot Reach Server**

1. **Ping the server by Hostname and IP Address:**
   * **Both Hostname and IP Address are pingable:**
     * The issue might be on the client side since the server is reachable.
   * **Hostname is not pingable but IP Address is pingable:**
     * Likely a DNS issue. Check:
       * `/etc/hosts`
       * `/etc/resolv.conf`
       * `/etc/nsswitch.conf`
       * **Test DNS Resolution:**
         * Use `nslookup`, `dig`, or `host`.
   * **Neither Hostname nor IP Address is pingable:**
     * Ping another server on the same network:
       * **Reachable:** The issue is with this specific host/server.
       * **Also unreachable:** Likely a broader network issue.
     * Log in via Virtual Console (if the server is powered on):
       * Check uptime using command `uptime`.
       * Verify if the server has an IP and if the network interface is UP.
         * Run the command `ip addr`.
         * Ensure the network interface (e.g., `eth0`, `ens33`) is listed and in the "UP" state.
       * Ping the gateway and check routes.
       * Check SELinux and firewall rules.
       * Inspect physical cable connections.
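
The decision tree above can be sketched as a small shell helper. This is a minimal sketch rather than a definitive diagnostic: the function names `interpret_ping` and `check_reachability` are illustrative, and it assumes a Linux `ping` that accepts `-c` (packet count) and `-W` (timeout in seconds).

```shell
# Map the two ping results (exit status 0 = success) to the likely cause.
interpret_ping() {
    host_rc=$1
    ip_rc=$2
    if [ "$host_rc" -eq 0 ] && [ "$ip_rc" -eq 0 ]; then
        echo "server reachable: check the client side"
    elif [ "$ip_rc" -eq 0 ]; then
        echo "likely DNS issue: check /etc/hosts, /etc/resolv.conf, /etc/nsswitch.conf"
    else
        echo "IP unreachable: compare with another server on the same network, then use the console"
    fi
}

# Ping by hostname and by IP address, then interpret both results.
check_reachability() {
    name=$1
    ip=$2
    ping -c 2 -W 2 "$name" >/dev/null 2>&1
    name_rc=$?
    ping -c 2 -W 2 "$ip" >/dev/null 2>&1
    addr_rc=$?
    interpret_ping "$name_rc" "$addr_rc"
}
```

For example, `check_reachability web01.example.com 192.0.2.10` (both values are placeholders) prints one of the three outcomes. From the console, `ip addr` and `ip route` then show the interface state and the routes.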

## **Cannot Reach Website or Application**

1. **Ping the server by Hostname and IP Address:**
   * **False (not pingable):** Follow the troubleshooting steps under “Cannot Reach Server.”
   * **True (pingable):** Check service availability using the `telnet` command with the appropriate port (e.g., `telnet <hostname> <port>`):
     * **True:** The port accepts connections, so the service is running.
     * **False:** The service is not reachable or not running. Check:
       * Service status (using `systemctl` or equivalent commands).
       * Firewall/SELinux settings.
       * Service logs.
       * Service configuration.
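
Where `telnet` is not installed, the same port check can be sketched with bash's `/dev/tcp` redirection. This assumes `bash` and the coreutils `timeout` command are available; the function names `port_open` and `check_service_port` are illustrative.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# The /dev/tcp/<host>/<port> redirection is a bash feature, so bash is
# invoked explicitly; host and port are passed as $0 and $1.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report the result and point to the follow-up checks listed above.
check_service_port() {
    if port_open "$1" "$2"; then
        echo "port $2 on $1 is open: the service is listening"
    else
        echo "port $2 on $1 is closed or filtered: check service status, firewall/SELinux, logs, configuration"
    fi
}
```

A call such as `check_service_port web01.example.com 443` (placeholder host) gives the same True/False answer as the `telnet` test.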

## **Unable to SSH as Root or User**

1. **Ping the server by Hostname and IP Address:**
   * **False (not pingable):** Follow the troubleshooting steps under “Cannot Reach Server.”
   * **True (pingable):** Check service availability using the `telnet` command with the SSH port (22 by default):
     * **True:** The service is running:
       * Check if the issue is on the client side.
       * Verify:
         * User account is not locked or expired.
         * User has a valid shell (not `nologin`).
         * Root login is not disabled (`PermitRootLogin`) in the SSH configuration.
     * **False:** The service is not reachable or not running. Check:
       * Service status (using `systemctl` or equivalent commands).
       * Firewall/SELinux settings.
       * Service logs.
       * Service configuration.
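
The account checks above can be partly scripted. The sketch below covers only the login-shell check and reads a passwd-format file; the function name `has_login_shell` is illustrative, and accounts served from LDAP or similar directories will not appear in `/etc/passwd`.

```shell
# Return 0 if the user exists in the passwd-format file and the shell field
# is not empty, nologin, or false.
has_login_shell() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    shell=$(awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file")
    case "$shell" in
        ""|*/nologin|*/false) return 1 ;;
        *) return 0 ;;
    esac
}
```

The remaining checks usually need root: `passwd -S <user>` reports whether the account is locked, and `sshd -T | grep -i permitrootlogin` prints the effective root-login setting.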

## **Disk Space is Full or Adding/Extending Disk Space**

1. **Detect Performance Degradation:**
   * Applications are slow or unresponsive.
   * Commands fail to execute (e.g., because the `/` filesystem is full).
   * Logging and other system operations fail.
2. **Analyze the Issue:**
   * Use `df -h` to identify the filesystem that is full (`df -i` shows whether it has run out of inodes instead).
3. **Take Action:**
   * Use `du` to find large files/directories in the affected filesystem.
   * Compress or remove large files.
   * Move files to another partition or server.
   * Check disk health with `badblocks` (e.g., `badblocks -v /dev/sda`).
   * Identify I/O-bound processes using `iostat`.
   * After moving large files/directories to another filesystem, create a symbolic link at the original path so existing references keep working.
4. **Add a New Disk:**
   * **Simple Partition:**
     * Add the disk to the VM.
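     * Verify that the new disk is visible using `lsblk` or `fdisk -l` (`df` lists only mounted filesystems).
     * Use `fdisk` to create a partition (prefer the LVM layout below if the volume may need to grow later).
     * Create a filesystem (e.g., with `mkfs`), mount it, and add it to `/etc/fstab` for persistence.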
   * **LVM Partition:**
     * Add the disk to the VM.
     * Verify with `lsblk` or `fdisk -l`.
     * Use `fdisk` to create a partition of type Linux LVM.
     * Set up the PV, VG, and LV (`pvcreate`, `vgcreate`, `lvcreate`).
     * Create a filesystem, mount it, and add it to `/etc/fstab`.
   * **Extend LVM Partition:**
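     * Add the disk and create an LVM partition on it.
     * Initialize the partition as a PV and add it to the existing VG (`pvcreate`, `vgextend`).
     * Extend the LV and resize the filesystem (`lvextend`, then `resize2fs` or `xfs_growfs`).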
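
The analysis and LVM extension steps above can be sketched as follows. This is a sketch under stated assumptions: the function names and the `DRY_RUN` switch are illustrative, and the device, volume group, and logical volume names in the usage note are placeholders. By default the LVM commands are only printed, so they can be reviewed before running them as root with `DRY_RUN=0`.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the largest directories under a path (sizes in KiB), staying on
# one filesystem (-x). The second argument limits the number of lines.
largest_dirs() {
    du -xk "$1" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Add a new partition to an existing volume group and grow a logical volume.
# -r asks lvextend to resize the filesystem together with the LV.
extend_lv() {
    partition=$1
    vg=$2
    lv=$3
    run pvcreate "$partition"
    run vgextend "$vg" "$partition"
    run lvextend -r -l +100%FREE "/dev/$vg/$lv"
}
```

For example, `largest_dirs /var 5` shows where the space went, and `extend_lv /dev/sdb1 vg_data lv_data` prints the three commands that would grow `lv_data` into the new partition.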

## **Filesystem Corruption**

1. **Symptoms:**
   * The system fails to boot.
2. **Check Logs:**
   * Investigate `/var/log/messages`, `dmesg`, and other log files.
   * Look for bad sector logs.
3. **Run `fsck` if Bad Sectors are Found:**
   * Reboot the system into rescue mode (e.g., boot from CD-ROM or ISO).
   * Run `fsck` on the affected device while its filesystem is unmounted (unmount it first if rescue mode mounted it under `/mnt/sysimage`).
   * Reboot the system.

## **Missing or Incorrect `fstab` File**

1. **Symptoms:**
   * The system fails to boot.
2. **Check Logs:**
   * Investigate `/var/log/messages`, `dmesg`, and other log files.
   * Look for mount failures that point to a missing or invalid `/etc/fstab` entry.
3. **Repair `fstab` from Rescue Mode:**
   * Reboot the system into rescue mode (e.g., boot from CD-ROM or ISO).
   * Select Option 1 to mount the original root filesystem under `/mnt/sysimage`.
   * Edit the entries in `/mnt/sysimage/etc/fstab`, or recreate the file using the UUIDs and filesystem types reported by `blkid`.
   * Reboot the system.
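
Recreating `fstab` entries from `blkid` output can be sketched like this. The function names are illustrative, and the `defaults` mount options are an assumption that may not suit every filesystem.

```shell
# Build one fstab line from a UUID, mount point, and filesystem type.
# Fields: device, mount point, type, options, dump flag, fsck pass order
# (1 for the root filesystem, 2 for other filesystems, 0 to skip the check).
format_fstab_entry() {
    uuid=$1
    mount_point=$2
    fs_type=$3
    pass=${4:-2}
    printf 'UUID=%s %s %s defaults 0 %s\n' "$uuid" "$mount_point" "$fs_type" "$pass"
}

# Look up the UUID and type with blkid (usually needs root), then format.
fstab_entry_for() {
    device=$1
    mount_point=$2
    dev_uuid=$(blkid -s UUID -o value "$device") || return 1
    dev_type=$(blkid -s TYPE -o value "$device") || return 1
    format_fstab_entry "$dev_uuid" "$mount_point" "$dev_type" "${3:-2}"
}
```

In rescue mode, something like `fstab_entry_for /dev/sda1 / 1 >> /mnt/sysimage/etc/fstab` would append the root entry; the device name here is a placeholder.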

## **Cannot `cd` to Directory (Even with Sudo Privileges)**

1. **Reasons and Resolutions:**
   * Directory does not exist.
   * Pathname conflict (relative vs absolute path).
   * Parent directory permission or ownership issues.
   * Missing execute (search) permission on the target directory.
   * Hidden directory (name starting with `.`) not shown by a plain `ls`; use `ls -a`.
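
A quick way to tell these causes apart is to test the path step by step. The helper below is a minimal sketch with an illustrative name, `explain_cd_failure`. Note that a missing search permission on a parent directory also makes the target look missing, in which case `namei -l <path>` shows the permissions of every path component.

```shell
# Report why `cd` into a path would fail, checking the most common causes.
explain_cd_failure() {
    target=$1
    if [ ! -e "$target" ]; then
        echo "missing: $target"
        return 1
    fi
    if [ ! -d "$target" ]; then
        echo "not a directory: $target"
        return 1
    fi
    if [ ! -x "$target" ]; then
        echo "no execute (search) permission: $target"
        return 1
    fi
    echo "ok: $target"
}
```

Because `cd` is a shell builtin, `sudo cd <dir>` cannot change the directory of the current shell; `sudo ls <dir>` or a root shell (`sudo -i`) is the usual way to inspect a directory that only root can enter.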

## **Cannot Create Links**

1. **Reasons and Resolutions:**
   * Target directory or file does not exist.
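   * Pathname conflict (relative vs absolute path): ensure the path is complete. A relative symbolic-link target is resolved from the link's directory, not from the current directory.
   * Parent directory permission or ownership issues.
   * Target file permission or ownership issues: must have read permissions.
   * Hidden directory or file (name starting with `.`) not shown by a plain `ls`; use `ls -a`.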
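
The relative-versus-absolute pitfall is easiest to see with symbolic links. The sketch below uses illustrative function names and assumes the coreutils `realpath` command is available.

```shell
# Return 0 if the path is a symbolic link whose target cannot be resolved.
link_is_broken() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Create a symbolic link with an absolute target path, so the link does not
# depend on the directory it lives in. Refuses to link to a missing target.
safe_symlink() {
    target=$1
    link=$2
    if [ ! -e "$target" ]; then
        echo "target does not exist: $target" >&2
        return 1
    fi
    ln -s "$(realpath "$target")" "$link"
}
```

A link created with `ln -s data.txt links/data` is broken unless `links/data.txt` exists, because the relative target is looked up inside `links/`; `safe_symlink data.txt links/data` avoids this by storing the absolute path.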

## **Running Out of Memory**

1. **Types of Memory:**
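   * **Cache:** CPU caches (L1, L2, L3).
   * **RAM:**
     * Usage details from `free -h`:
       * **Total:** Total memory available to the system.
       * **Used:** Memory currently in use.
       * **Free:** Memory not used for anything at all.
       * **Shared:** Shared memory (mostly `tmpfs`).
       * **Buff/Cache:** Memory used for kernel buffers and the page cache.
       * **Available:** Estimate of memory available for starting new applications without swapping; it includes reclaimable cache.
     * Check `/proc/meminfo` for detailed metrics:
       * `Active(file)`/`Inactive(file)` and `Active(anon)`/`Inactive(anon)`.
   * **Swap (Virtual Memory):** Disk space used as an extension of RAM; heavy swap use slows the system, so monitor and manage it for stability.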
2. **Resolutions:**
   * Identify high-memory processes using `top`, `htop`, or `ps`.
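   * Check logs (e.g., `dmesg`, `journalctl -k`) for OOM killer events and review memory overcommit settings such as `vm.overcommit_memory` in `/etc/sysctl.conf`.
   * Kill or restart memory-hogging processes/services.
   * Use `nice` or `renice` to give critical processes CPU priority (this does not reduce their memory use).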
   * Add or extend swap space.
   * Install more physical RAM.
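
The checks above can be sketched as two helpers. The function names are illustrative; the first reads the `MemTotal` and `MemAvailable` fields from `/proc/meminfo`, and the second assumes the procps version of `ps`, which supports `--sort`.

```shell
# Percentage of RAM the kernel estimates is available without swapping,
# computed from MemAvailable and MemTotal in a meminfo-format file.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Show the N processes using the most resident memory (default 5),
# plus the header line.
top_memory_processes() {
    ps -eo pid,comm,rss,%mem --sort=-rss | head -n "$(( ${1:-5} + 1 ))"
}
```

A low value from `mem_available_pct` together with the output of `top_memory_processes` usually identifies which process to restart or investigate first.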

## **Add or Extend Swap Space**

1. **Steps to Add Swap Space:**
   * Create a file using `dd` to reserve disk blocks for swap.
   * Set file permissions to `600` and assign root ownership.
   * Format the file for swap with `mkswap`.
   * Enable swap using `swapon`.
   * Add the swap file to `fstab` for persistence.
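
The steps above map to the following commands. This is a sketch under stated assumptions: the function name `add_swap_file` and the `DRY_RUN` switch are illustrative, and the path and size in the usage note are placeholders. By default the commands are only printed; set `DRY_RUN=0` and run as root to execute them.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create and enable a swap file of the given size in MiB.
add_swap_file() {
    swap_file=$1
    size_mb=$2
    run dd if=/dev/zero of="$swap_file" bs=1M count="$size_mb"
    run chown root:root "$swap_file"
    run chmod 600 "$swap_file"
    run mkswap "$swap_file"
    run swapon "$swap_file"
    # Persistence is left as a manual step: print the fstab line to add.
    echo "add to /etc/fstab: $swap_file none swap sw 0 0"
}
```

For example, `add_swap_file /swapfile 1024` prints the commands for a 1 GiB swap file. After running them, `swapon --show` or `free -h` confirms that the new swap space is active.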

## **Unable to Run Certain Commands**

1. **Troubleshooting and Resolutions:**
   * **Command issues:**
     * System-related commands may require root access.
     * User-defined scripts/commands might have restrictions.
   * **Steps to troubleshoot:**
     * Check permission or ownership of the command/script.
     * Ensure sudo privileges are configured.
     * Verify the absolute or relative path to the command/script.
     * Ensure the command is in the user's `$PATH` variable.
     * Confirm that the command is installed.
     * Check for missing or deleted command libraries.
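
Most of the checks above can be combined into one helper. This is a minimal sketch with an illustrative name, `diagnose_command`; it relies on `command -v` for the `$PATH` lookup and on `ldd` to list shared libraries, and it does not cover sudo configuration.

```shell
# Explain why a command or script might not run for the current user.
diagnose_command() {
    name=$1
    case "$name" in
        */*) path=$name ;;   # a path was given: check that file directly
        *)   path=$(command -v "$name") || { echo "not found in PATH: $name"; return 1; } ;;
    esac
    case "$path" in
        */*) ;;
        *) echo "shell builtin, alias, or function: $name"; return 0 ;;
    esac
    if [ ! -e "$path" ]; then
        echo "no such file: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not executable by this user: $path"
        return 1
    fi
    # ldd prints "not found" next to shared libraries that are missing.
    if ldd "$path" 2>/dev/null | grep -q 'not found'; then
        echo "missing shared libraries: $path"
        return 1
    fi
    echo "ok: $path"
}
```

Running `diagnose_command <name>` as the affected user, and again with the full path to the script, narrows the problem to `$PATH`, permissions, or libraries.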

## **System Unexpectedly Rebooting and Processes Restarting**

1. **Troubleshooting and Resolution:**
   * **System Reboot/Crash Reasons:**
     * CPU stress.
     * RAM stress.
     * Kernel fault.
     * Hardware fault.
   * **Process Restart Causes:**
     * System reboot triggers process restarts.
     * Processes might restart themselves.
     * Watchdog applications:
       * Prevent high stress on system resources.
       * Restart or terminate processes causing excessive stress.
   * **Troubleshooting Steps:**
     * After logging in, check system status using commands like:
       * `uptime`, `top`, `dmesg`, `journalctl`, `iostat -xz 1`.
     * Examine log files such as `/var/log/syslog`, `/var/log/boot.log`, `/var/log/dmesg`, and `/var/log/messages` (names vary by distribution).
     * Check custom application log paths.
     * If the server is inaccessible, use its virtual console (e.g., iLO, iDRAC).
     * Open a support case with the vendor if needed.

## **Unable to Get an IP Address**

1. **IP Assignment Methods:**
   * **DHCP:**
     * Fixed Allocation.
     * Dynamic Allocation.
   * **Static IP.**
2. **Troubleshooting Steps:**
   * Check network settings in the virtualization environment (e.g., VMware, VirtualBox).
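   * Verify whether an IP address has been assigned (e.g., with `ip addr`).
   * Check the NIC status on the host using tools like `lspci` (hardware detection) and `nmcli device status`.
   * Restart the network service (e.g., `systemctl restart NetworkManager` on systems that use NetworkManager).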
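
Checking whether an interface has an address can be scripted by parsing the one-line output of `ip -o -4 addr show`. The function names below are illustrative, and the sketch covers IPv4 only.

```shell
# Read `ip -o -4 addr show` output on stdin and print the IPv4 address
# (in CIDR notation) assigned to the given interface, if any.
ipv4_of_iface() {
    awk -v iface="$1" '$2 == iface && $3 == "inet" { print $4 }'
}

# Report whether an interface currently has an IPv4 address.
check_iface_ip() {
    addr=$(ip -o -4 addr show | ipv4_of_iface "$1")
    if [ -n "$addr" ]; then
        echo "$1 has address $addr"
    else
        echo "$1 has no IPv4 address: check DHCP/static configuration and link state"
        return 1
    fi
}
```

For example, `check_iface_ip ens33` (placeholder interface name) answers the "has an IP been assigned" question before moving on to the NIC and service checks.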

## **Backup and Restore File Permissions in Linux**

1. **Backup and Restore Steps:**
   * The best option is to save the permissions and ACLs of the directories/files to a file before making bulk permission changes:
     * Backup file permissions: `getfacl -R <dir> > permissions.acl`.
     * Restore file permissions: `setfacl --restore=permissions.acl` (the recorded paths are relative, so run it from the directory they are relative to).
   * Restore using a VM snapshot (not ideal for production environments).
   * Rebuild the VM (a safer option for long-term stability).
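
The ACL backup and restore can be wrapped so that both commands always run from the same directory. This is a minimal sketch with illustrative function names; it assumes the `acl` package (`getfacl`, `setfacl`) is installed and that the backup file is given as an absolute path.

```shell
# Save permissions, owners, and ACLs for everything under a directory.
# Paths are recorded relative to that directory.
backup_acls() {
    dir=$1
    backup_file=$2   # absolute path to the backup file
    ( cd "$dir" && getfacl -R . > "$backup_file" )
}

# Restore from a backup made by backup_acls, from the same directory.
restore_acls() {
    dir=$1
    backup_file=$2   # absolute path to the backup file
    ( cd "$dir" && setfacl --restore="$backup_file" )
}
```

For example, run `backup_acls /srv/app /root/app-permissions.acl` before a bulk `chmod`, and `restore_acls /srv/app /root/app-permissions.acl` to undo it; both paths are placeholders.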

## **Useful Tips Related to Disk Partitioning**

1. **Tips for Managing Disk Partitions:**
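   * After attaching a new disk to a VM, use `lsblk` to check whether it is visible. If it is not, rescan the SCSI bus:
     * `echo "- - -" > /sys/class/scsi_host/host0/scan` (repeat for each `hostN`).
   * After increasing the size of an existing disk, make the kernel re-read its size:
     * `echo 1 > /sys/block/sda/device/rescan`.
   * Increasing the size of an existing disk appends space to the end of the disk without affecting the existing filesystem or partition; both must still be grown to use the new space.
   * Recreating the filesystem on a block device overwrites the old filesystem, so its data is no longer accessible.
   * A disk with an existing partition/filesystem can be shared with another VM through its `.vmdk` file. After mounting it there, the data is unchanged.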
