Command Line Mastery for Web Developers

Introduction to Command Line for Web Development

  • Why Command Line: Importance in modern web development.
  • Getting Started: Basic CLI commands, navigation, file manipulation.

Advanced Git Techniques

  • Rebasing and Merging: Strategies for clean history and resolving conflicts.
  • Bisect and Reflog: Tools for debugging and history traversal.
  • Hooks and Automation: Customizing Git workflow.

NPM Mastery

  • Scripting and Automation: Writing efficient NPM scripts.
  • Dependency Management: Handling version conflicts, updating packages.
  • NPM vs Yarn: Comparing package managers.

Automating with Gulp

  • Setting Up Gulp: Basic setup and configuration.
  • Common Tasks: Examples like minification, concatenation, and image optimization.
  • Optimizing Build Process: Streamlining tasks for efficiency.

Bash Scripting Essentials

  • Script Basics: Writing and executing scripts.
  • Useful Commands: Loops, conditionals, and input handling.
  • Real-World Scripts: Practical examples for automation.

SSH for Secure Remote Development

  • Key Management: Creating and using SSH keys.
  • Remote Commands: Executing commands on remote servers.
  • Tunneling and Port Forwarding: Secure access to remote resources.

Command Line Debugging Techniques

  • Basic Tools: Introduction to tools like curl, netstat, top.
  • Web-Specific Debugging: Analyzing network requests, performance issues.
  • Logs Analysis: Working with access and error logs.

Docker Command Line Usage

  • Docker CLI Basics: Common commands and workflows.
  • Dockerfiles: Creating and understanding Dockerfiles.
  • Container Management: Running, stopping, and managing containers.

Command Line Version Control

  • Version Control Systems: Git, SVN command line usage.
  • Branching and Tagging: Best practices for branch management.
  • Stashing and Cleaning: Managing uncommitted changes.

Performance Monitoring via CLI

  • Tools Overview: htop, vmstat, iostat.
  • Real-Time Monitoring: Tracking system and application performance.
  • Bottleneck Identification: Finding and resolving performance issues.

Securing Web Projects through CLI

  • File Permissions: Setting and understanding file permissions.
  • SSL Certificates: Managing SSL/TLS for web security.
  • Security Audits: Basic command line tools for security checking.

Text Manipulation and Log Analysis

  • Essential Commands: Mastery of sed, awk, grep.
  • Regular Expressions: Using regex for text manipulation.
  • Log File Parsing: Techniques for efficient log analysis.

Interactive Examples and Challenges

  • Practical Exercises: Step-by-step challenges for each section.
  • Solution Discussion: Explaining solutions and alternatives.

Resource Hub

  • Further Reading: Links to advanced tutorials, books, and online resources.
  • Tool Documentation: Official documentation for the mentioned tools.

FAQ and Troubleshooting Guide

  • Common Issues: Solutions to frequent problems and errors.
  • Tips and Tricks: Enhancing usability and productivity.

Glossary

  • Key Terms Defined: Clear definitions of CLI and development terms.

Command Line Mastery for Web Developers

Introduction to Command Line for Web Development

Why Command Line: Importance in Modern Web Development

The command line interface (CLI) has become an indispensable tool for modern web developers. While graphical interfaces provide convenience, the CLI offers unmatched power, speed, and automation capabilities that are essential in today's development landscape.

Key Benefits:

  • Speed and Efficiency: Execute complex operations with simple commands faster than navigating through GUI menus
  • Automation: Script repetitive tasks to save time and reduce human error
  • Remote Server Management: Essential for deploying and maintaining web applications on remote servers
  • Tool Integration: Most modern development tools (Git, Node.js, Docker, etc.) are CLI-first
  • Precision Control: Fine-grained control over system operations and configurations
  • Debugging Power: Access to system logs, process monitoring, and diagnostic tools
  • Universal Skills: CLI knowledge transfers across different operating systems and environments

Getting Started: Basic CLI Commands, Navigation, File Manipulation

Essential Navigation Commands:

# Directory navigation
pwd                    # Print working directory
ls                     # List directory contents
ls -la                 # List with detailed info including hidden files
cd /path/to/directory  # Change directory
cd ..                  # Move up one directory
cd ~                   # Go to home directory
cd -                   # Go to previous directory

# File and directory operations
mkdir project-name     # Create directory
mkdir -p path/to/dir   # Create nested directories
rmdir directory-name   # Remove empty directory
rm filename           # Delete file
rm -rf directory      # Delete directory and contents (use carefully!)
cp source destination # Copy files
mv source destination # Move/rename files

File Content Operations:

# Viewing file contents
cat filename          # Display entire file
less filename         # View file page by page
head filename         # Show first 10 lines
tail filename         # Show last 10 lines
tail -f filename      # Follow file changes in real-time

# Creating and editing files
touch filename        # Create empty file
nano filename         # Edit with nano editor
vim filename          # Edit with vim editor
echo "content" > file # Write content to file
echo "content" >> file # Append content to file

File Permissions and Ownership:

# Understanding permissions (rwxrwxrwx = user/group/others)
ls -l                 # View detailed file permissions
chmod 755 filename    # Set permissions (rwxr-xr-x)
chmod +x script.sh    # Make file executable
chown user:group file # Change file ownership

Advanced Git Techniques

Rebasing and Merging: Strategies for Clean History and Resolving Conflicts

Interactive Rebasing:

# Interactive rebase to clean up commits
git rebase -i HEAD~3  # Rebase last 3 commits
git rebase -i main    # Rebase onto main branch

# Rebase options in interactive mode:
# pick = use commit as-is
# reword = use commit but edit message
# edit = use commit but stop for amending
# squash = combine with previous commit
# fixup = like squash but discard commit message
# drop = remove commit

Merge Strategies:

# Fast-forward merge (clean, linear history)
git merge feature-branch

# No fast-forward merge (preserves branch context)
git merge --no-ff feature-branch

# Squash merge (combines all commits into one)
git merge --squash feature-branch
git commit -m "Add feature X"

Conflict Resolution:

# When conflicts occur during merge/rebase
git status                    # See conflicted files
git diff                      # View conflicts
# Edit files to resolve conflicts, then:
git add resolved-file.js
git rebase --continue         # Continue rebase
git merge --continue          # Continue merge

# Abort if needed
git rebase --abort
git merge --abort

Bisect and Reflog: Tools for Debugging and History Traversal

Git Bisect for Bug Hunting:

# Start bisect session
git bisect start
git bisect bad                # Current commit is bad
git bisect good v1.0.0        # Known good commit

# Git will checkout middle commit for testing
# After testing:
git bisect good               # If commit is good
git bisect bad                # If commit is bad

# Git continues until bug is found
git bisect reset              # End bisect session

Reflog for History Recovery:

# View reflog (local history of HEAD)
git reflog
git reflog --oneline

# Recover lost commits
git reset --hard HEAD@{2}     # Go back to specific reflog entry
git cherry-pick abc123        # Recover specific commit

# Reflog for branches
git reflog show feature-branch

Hooks and Automation: Customizing Git Workflow

Common Git Hooks:

# Hook locations: .git/hooks/
# Make hooks executable: chmod +x .git/hooks/hook-name

# Pre-commit hook (runs before commit)
#!/bin/sh
# .git/hooks/pre-commit
npm run lint
npm run test

Pre-push Hook Example:

#!/bin/sh
# .git/hooks/pre-push
protected_branch='main'
current_branch=$(git symbolic-ref --short HEAD)

if [ "$protected_branch" = "$current_branch" ]; then
    echo "Direct push to main branch is not allowed"
    exit 1
fi
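
Another common hook is commit-msg, which can enforce a message convention before a commit is recorded. A minimal sketch (the prefix list below is an illustrative convention, not a Git requirement):

#!/bin/sh
# .git/hooks/commit-msg
# $1 is the path to the file holding the proposed commit message
commit_msg=$(cat "$1")

if ! echo "$commit_msg" | grep -qE '^(feat|fix|docs|chore|refactor|test): '; then
    echo "Commit message must start with feat:, fix:, docs:, chore:, refactor:, or test:"
    exit 1
fi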

NPM Mastery

Scripting and Automation: Writing Efficient NPM Scripts

Package.json Scripts Section:

{
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "build": "webpack --mode=production",
    "build:dev": "webpack --mode=development",
    "test": "jest",
    "test:watch": "jest --watch",
    "test:coverage": "jest --coverage",
    "lint": "eslint src/",
    "lint:fix": "eslint src/ --fix",
    "clean": "rimraf dist/",
    "prebuild": "npm run clean",
    "postbuild": "npm run test",
    "deploy": "npm run build && gh-pages -d dist"
  }
}

Advanced NPM Script Techniques:

# Running scripts
npm run build                 # Run build script
npm start                     # Shorthand for npm run start
npm test                      # Shorthand for npm run test

# Passing arguments to scripts
npm run build -- --watch     # Pass --watch to build script
npm run test -- --verbose    # Pass --verbose to test

# Running multiple scripts
npm run lint && npm run test  # Sequential execution
npm run lint & npm run test   # Parallel (POSIX shells; npm-run-all is a portable alternative)

# Cross-platform scripts using npm packages
npm install --save-dev cross-env rimraf
# Then use in package.json:
"build": "cross-env NODE_ENV=production webpack",
"clean": "rimraf dist/"

Dependency Management: Handling Version Conflicts, Updating Packages

Understanding Package Versions:

# Semantic versioning (MAJOR.MINOR.PATCH)
# ^1.2.3 = Compatible within major version (>=1.2.3 <2.0.0)
# ~1.2.3 = Compatible within minor version (>=1.2.3 <1.3.0)
# 1.2.3 = Exact version

# Install specific versions
npm install lodash@4.17.21    # Exact version
npm install lodash@^4.17.0    # Compatible version
npm install lodash@latest     # Latest version

Dependency Management Commands:

# View dependency tree
npm list                      # Local dependencies
npm list -g                   # Global dependencies
npm list --depth=0            # Top-level only

# Check for outdated packages
npm outdated                  # Show outdated packages
npm audit                     # Security audit
npm audit fix                 # Fix security issues

# Update packages
npm update                    # Update all packages
npm update package-name       # Update specific package
npm install package@latest    # Force latest version

Lock Files and Reproducible Builds:

# package-lock.json ensures reproducible installs
npm ci                        # Clean install strictly from the lock file
# (Yarn's equivalent flag is --frozen-lockfile; npm does not support it)

# Cleaning and troubleshooting
npm cache clean --force       # Clear npm cache
rm -rf node_modules package-lock.json
npm install                   # Fresh install
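
These checks can be rolled into a small maintenance script. A sketch (the script name and audit threshold are illustrative):

#!/bin/bash
# dep-check.sh - quick dependency health check for a Node.js project

echo "== Outdated packages =="
npm outdated || true            # npm outdated exits non-zero when packages lag

echo "== Security audit =="
npm audit --audit-level=high    # report, failing only on high/critical issues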

NPM vs Yarn: Comparing Package Managers

NPM vs Yarn Command Comparison:

# Installation
npm install          # Yarn: yarn install
npm install package  # Yarn: yarn add package
npm install -D pkg   # Yarn: yarn add -D package
npm install -g pkg   # Yarn: yarn global add package

# Running scripts
npm run script       # Yarn: yarn script
npm start           # Yarn: yarn start
npm test            # Yarn: yarn test

# Other operations
npm outdated        # Yarn: yarn outdated
npm audit           # Yarn: yarn audit
npm publish         # Yarn: yarn publish

Key Differences:

  • Performance: Yarn has traditionally been faster thanks to parallel downloads and caching
  • Security: Both have audit capabilities; Yarn checks package integrity by default
  • Lock Files: Both use lock files (package-lock.json vs yarn.lock); see the CI install commands below
  • Workspaces: Both support monorepo workspaces
  • Offline Mode: Yarn has better offline capabilities
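
In CI environments, both managers can install strictly from their lock files. A typical pairing (assuming a standard Node.js project):

# Reproducible installs in CI
npm ci                          # npm: install exactly what package-lock.json specifies
yarn install --frozen-lockfile  # Yarn 1: fail if yarn.lock would need changes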

Automating with Gulp

Setting Up Gulp: Basic Setup and Configuration

Initial Gulp Setup:

# Install Gulp CLI globally
npm install -g gulp-cli

# Install Gulp locally in project
npm install --save-dev gulp

# Create gulpfile.js
touch gulpfile.js

Basic Gulpfile Structure:

// gulpfile.js
const gulp = require('gulp');
const sass = require('gulp-sass')(require('sass'));
const uglify = require('gulp-uglify');
const concat = require('gulp-concat');
const cleanCSS = require('gulp-clean-css');

// Define paths
const paths = {
  styles: {
    src: 'src/scss/**/*.scss',
    dest: 'dist/css/'
  },
  scripts: {
    src: 'src/js/**/*.js',
    dest: 'dist/js/'
  }
};

// Export tasks (styles, scripts, and watch are defined in the sections that follow)
exports.styles = styles;
exports.scripts = scripts;
exports.watch = watch;
exports.build = gulp.series(styles, scripts);
exports.default = exports.build;

Common Tasks: Examples like Minification, Concatenation, and Image Optimization

CSS Processing Task:

function styles() {
  return gulp.src(paths.styles.src)
    .pipe(sass().on('error', sass.logError))
    .pipe(cleanCSS())
    .pipe(gulp.dest(paths.styles.dest));
}

JavaScript Processing Task:

function scripts() {
  return gulp.src(paths.scripts.src)
    .pipe(concat('main.js'))
    .pipe(uglify())
    .pipe(gulp.dest(paths.scripts.dest));
}

Image Optimization Task:

const imagemin = require('gulp-imagemin');

function images() {
  return gulp.src('src/images/**/*')
    .pipe(imagemin([
      imagemin.gifsicle({interlaced: true}),
      imagemin.mozjpeg({quality: 75, progressive: true}),
      imagemin.optipng({optimizationLevel: 5}),
      imagemin.svgo({
        plugins: [
          {removeViewBox: true},
          {cleanupIDs: false}
        ]
      })
    ]))
    .pipe(gulp.dest('dist/images'));
}

Watch Task for Development:

function watch() {
  gulp.watch(paths.styles.src, styles);
  gulp.watch(paths.scripts.src, scripts);
  gulp.watch('src/images/**/*', images);
}

Optimizing Build Process: Streamlining Tasks for Efficiency

Parallel Task Execution:

// Run tasks in parallel for better performance
const build = gulp.parallel(styles, scripts, images);
const dev = gulp.series(build, watch);

exports.build = build;
exports.dev = dev;

Conditional Processing:

const gulpif = require('gulp-if');
const sourcemaps = require('gulp-sourcemaps');

const isProduction = process.env.NODE_ENV === 'production';

function styles() {
  return gulp.src(paths.styles.src)
    .pipe(gulpif(!isProduction, sourcemaps.init()))
    .pipe(sass().on('error', sass.logError))
    .pipe(gulpif(isProduction, cleanCSS()))
    .pipe(gulpif(!isProduction, sourcemaps.write()))
    .pipe(gulp.dest(paths.styles.dest));
}

Bash Scripting Essentials

Script Basics: Writing and Executing Scripts

Creating Your First Script:

#!/bin/bash
# deploy.sh - Simple deployment script

echo "Starting deployment..."

# Variables
PROJECT_DIR="/var/www/myproject"
BACKUP_DIR="/var/backups/myproject"
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup
echo "Creating backup..."
cp -r "$PROJECT_DIR" "$BACKUP_DIR/backup_$DATE"

# Deploy new code
echo "Deploying new code..."
cd "$PROJECT_DIR" || exit 1
git pull origin main
npm install
npm run build

echo "Deployment complete!"

Making Scripts Executable:

chmod +x deploy.sh        # Make executable
./deploy.sh              # Run script
bash deploy.sh           # Alternative way to run

Script Arguments and Parameters:

#!/bin/bash
# script-with-args.sh

echo "Script name: $0"
echo "First argument: $1"
echo "Second argument: $2"
echo "All arguments: $@"
echo "Number of arguments: $#"

# Usage: ./script-with-args.sh arg1 arg2
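
For named flags instead of positional arguments, bash's built-in getopts is the usual tool. A minimal sketch (the script name and flags are illustrative):

#!/bin/bash
# deploy-flags.sh - parse -e <environment> and -v with getopts

VERBOSE=0
ENVIRONMENT=""

while getopts "e:v" opt; do
    case $opt in
        e) ENVIRONMENT=$OPTARG ;;
        v) VERBOSE=1 ;;
        *) echo "Usage: $0 -e <environment> [-v]"; exit 1 ;;
    esac
done

echo "Environment: $ENVIRONMENT, verbose: $VERBOSE"

# Usage: ./deploy-flags.sh -e production -v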

Useful Commands: Loops, Conditionals, and Input Handling

Conditional Statements:

#!/bin/bash

# If-else statement
if [ "$1" = "production" ]; then
    echo "Deploying to production"
    NODE_ENV=production
elif [ "$1" = "staging" ]; then
    echo "Deploying to staging"
    NODE_ENV=staging
else
    echo "Unknown environment: $1"
    exit 1
fi

# File/directory checks
if [ -f "package.json" ]; then
    echo "Node.js project detected"
fi

if [ -d "node_modules" ]; then
    echo "Dependencies already installed"
else
    npm install
fi

Loops:

#!/bin/bash

# For loop with array
ENVIRONMENTS=("development" "staging" "production")
for env in "${ENVIRONMENTS[@]}"; do
    echo "Processing $env environment"
    # Deploy to $env
done

# For loop with file list
for file in *.js; do
    echo "Processing $file"
    # Process JavaScript file
done

# While loop
counter=1
while [ $counter -le 5 ]; do
    echo "Iteration $counter"
    ((counter++))
done

Input Handling:

#!/bin/bash

# Reading user input
read -p "Enter environment (dev/prod): " environment
read -s -p "Enter password: " password  # Silent input
echo

# Input validation
case $environment in
    dev|development)
        echo "Deploying to development"
        ;;
    prod|production)
        echo "Deploying to production"
        ;;
    *)
        echo "Invalid environment"
        exit 1
        ;;
esac
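
Input handling works best alongside strict error handling. A common defensive preamble for deployment scripts (a sketch; the temp-file pattern is illustrative):

#!/bin/bash
set -euo pipefail    # exit on errors, unset variables, and pipeline failures

cleanup() {
    echo "Cleaning up temporary files..."
    rm -f /tmp/deploy.$$.*
}
trap cleanup EXIT    # run cleanup on any exit, success or failure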

Real-World Scripts: Practical Examples for Automation

Project Setup Script:

#!/bin/bash
# setup-project.sh

PROJECT_NAME=$1

if [ -z "$PROJECT_NAME" ]; then
    read -p "Enter project name: " PROJECT_NAME
fi

echo "Setting up $PROJECT_NAME..."

# Create project structure
mkdir -p "$PROJECT_NAME"/{src,dist,tests}
cd "$PROJECT_NAME" || exit 1

# Initialize package.json
npm init -y

# Install common dependencies
npm install --save-dev webpack webpack-cli eslint prettier

# Create basic files
echo "console.log('Hello, $PROJECT_NAME!');" > src/index.js
echo "# $PROJECT_NAME" > README.md

# Initialize Git
git init
echo "node_modules/" > .gitignore
git add .
git commit -m "Initial commit"

echo "Project $PROJECT_NAME setup complete!"

Backup Script:

#!/bin/bash
# backup.sh

SOURCE_DIR="/var/www/html"
BACKUP_DIR="/var/backups/web"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="backup_$DATE.tar.gz"

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Create compressed backup
tar -czf "$BACKUP_DIR/$BACKUP_FILE" -C "$SOURCE_DIR" .

# Keep only last 7 backups
cd "$BACKUP_DIR" || exit 1
ls -t backup_*.tar.gz | tail -n +8 | xargs -r rm

echo "Backup created: $BACKUP_FILE"

SSH for Secure Remote Development

Key Management: Creating and Using SSH Keys

Generating SSH Keys:

# Generate new SSH key pair
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
ssh-keygen -t ed25519 -C "your_email@example.com"  # More secure option

# Generate with custom filename
ssh-keygen -t rsa -f ~/.ssh/id_rsa_custom

# Generate without passphrase (less secure but convenient)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa_auto

SSH Key Management:

# View public key
cat ~/.ssh/id_rsa.pub

# Copy public key to clipboard (macOS)
pbcopy < ~/.ssh/id_rsa.pub

# Copy public key to clipboard (Linux)
xclip -selection clipboard < ~/.ssh/id_rsa.pub

# Add key to SSH agent
ssh-add ~/.ssh/id_rsa
ssh-add -l                    # List added keys
ssh-add -D                    # Remove all keys from agent
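
To install a public key on a server, ssh-copy-id is the standard shortcut; a manual fallback works on systems without it:

# Copy public key to a remote server's authorized_keys
ssh-copy-id user@hostname
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@hostname   # Specific key

# Manual alternative
cat ~/.ssh/id_ed25519.pub | ssh user@hostname 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'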

SSH Config File:

# ~/.ssh/config
Host production
    HostName prod.example.com
    User ubuntu
    IdentityFile ~/.ssh/id_rsa_prod
    Port 2222

Host staging
    HostName staging.example.com
    User deploy
    IdentityFile ~/.ssh/id_rsa_staging

Host github
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_github

Remote Commands: Executing Commands on Remote Servers

Basic SSH Usage:

# Connect to remote server
ssh user@hostname
ssh -p 2222 user@hostname     # Custom port
ssh production                # Using SSH config alias

# Execute single command
ssh user@hostname 'ls -la'
ssh user@hostname 'cd /var/www && git pull'

# Execute multiple commands
ssh user@hostname 'cd /var/www && git pull && npm install && pm2 restart app'

File Transfer with SCP and RSYNC:

# SCP (Secure Copy)
scp file.txt user@hostname:/remote/path/
scp -r directory/ user@hostname:/remote/path/
scp user@hostname:/remote/file.txt ./local/path/

# RSYNC (More efficient for large transfers)
rsync -av --progress local/ user@hostname:/remote/
rsync -av --exclude='node_modules' ./ user@hostname:/var/www/
rsync -av --delete local/ remote/  # Delete files not in source

Remote Script Execution:

# Execute local script on remote server
ssh user@hostname 'bash -s' < local-script.sh

# Execute script with arguments (pass them to bash -s, not after the redirect)
ssh user@hostname 'bash -s production' < deploy.sh   # "production" becomes $1

# Heredoc for inline scripts
ssh user@hostname << 'EOF'
cd /var/www/myapp
git pull origin main
npm install
pm2 restart myapp
EOF

Tunneling and Port Forwarding: Secure Access to Remote Resources

Local Port Forwarding:

# Forward local port to remote service
ssh -L 8080:localhost:80 user@hostname
# Now localhost:8080 connects to hostname:80

# Access remote database
ssh -L 3306:localhost:3306 user@db-server
# Now you can connect to localhost:3306 to reach remote MySQL

# Multiple port forwards
ssh -L 8080:localhost:80 -L 3306:localhost:3306 user@hostname

Remote Port Forwarding:

# Make local service available on remote server
ssh -R 8080:localhost:3000 user@hostname
# Remote server's port 8080 now forwards to your local port 3000

# Useful for webhook testing
ssh -R 8080:localhost:3000 user@server
# Webhook can reach http://server:8080 to hit your local dev server

Dynamic Port Forwarding (SOCKS Proxy):

# Create SOCKS proxy
ssh -D 1080 user@hostname
# Configure browser to use localhost:1080 as SOCKS proxy

# Background SSH tunnel
ssh -f -N -D 1080 user@hostname
# -f: background, -N: don't execute commands
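
Long-lived tunnels drop when the network hiccups. If autossh is available (a separate package on most distros), it restarts the tunnel automatically; a sketch:

# Persistent SOCKS proxy that survives interruptions
autossh -M 0 -f -N -D 1080 \
    -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
    user@hostname
# -M 0 disables autossh's monitor port; SSH keepalives handle failure detection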

Command Line Debugging Techniques

Basic Tools: Introduction to tools like curl, netstat, top

CURL for HTTP Debugging:

# Basic HTTP requests
curl https://api.example.com/users
curl -X POST https://api.example.com/users
curl -X PUT https://api.example.com/users/1
curl -X DELETE https://api.example.com/users/1

# Headers and data
curl -H "Content-Type: application/json" \
     -H "Authorization: Bearer token" \
     -d '{"name": "John"}' \
     https://api.example.com/users

# Save response and show headers
curl -v https://example.com          # Verbose output
curl -I https://example.com          # Headers only
curl -o response.json https://api.example.com/data
curl -L https://example.com          # Follow redirects

# Timing and performance
curl -w "@curl-format.txt" https://example.com
# curl-format.txt contains:
#     time_namelookup:  %{time_namelookup}\n
#     time_connect:     %{time_connect}\n
#     time_appconnect:  %{time_appconnect}\n
#     time_pretransfer: %{time_pretransfer}\n
#     time_redirect:    %{time_redirect}\n
#     time_starttransfer: %{time_starttransfer}\n
#     time_total:       %{time_total}\n
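
For a quick check without a format file, the same timing variables can be passed inline:

# One-liner timing, no format file needed
curl -o /dev/null -s -w 'dns: %{time_namelookup}s  connect: %{time_connect}s  total: %{time_total}s\n' https://example.com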

NETSTAT for Network Debugging:

# Show all connections
netstat -a                   # All connections
netstat -at                  # TCP connections only
netstat -au                  # UDP connections only
netstat -l                   # Listening ports only

# Show processes using ports
netstat -tulpn               # All listening ports with processes
netstat -tulpn | grep :3000  # Check if port 3000 is in use

# Check specific service
netstat -an | grep :80       # Check web server
netstat -an | grep :22       # Check SSH

TOP and System Monitoring:

# System monitoring
top                          # Real-time system stats
htop                         # Enhanced version of top
top -p PID                   # Monitor specific process

# Process information
ps aux                       # All running processes
ps aux | grep node          # Find Node.js processes
pgrep -f "node server.js"    # Find process by command

# Memory and disk usage
free -h                      # Memory usage
df -h                        # Disk usage
du -sh directory/            # Directory size

Web-Specific Debugging: Analyzing Network Requests, Performance Issues

HTTP Header Analysis:

# Analyze response headers
curl -I https://example.com
# Look for:
# - Cache-Control headers
# - Content-Encoding (gzip, etc.)
# - Security headers (CSP, HSTS, etc.)
# - Server information

# Test CORS
curl -H "Origin: https://mydomain.com" \
     -H "Access-Control-Request-Method: POST" \
     -H "Access-Control-Request-Headers: X-Requested-With" \
     -X OPTIONS \
     https://api.example.com/endpoint

SSL/TLS Debugging:

# Check SSL certificate
openssl s_client -connect example.com:443 -servername example.com

# Check certificate expiration (redirect stdin so s_client exits instead of waiting)
openssl s_client -connect example.com:443 </dev/null 2>/dev/null | openssl x509 -noout -dates

# Test specific SSL versions
openssl s_client -tls1_2 -connect example.com:443

DNS Debugging:

# DNS lookup
nslookup example.com
dig example.com

# Check specific record types
dig example.com MX           # Mail records
dig example.com NS           # Name servers
dig example.com TXT          # Text records

# Reverse DNS lookup
dig -x 8.8.8.8

# Trace DNS resolution
dig +trace example.com

Logs Analysis: Working with Access and Error Logs

Web Server Log Analysis:

# Apache/Nginx access logs
tail -f /var/log/apache2/access.log
tail -f /var/log/nginx/access.log

# Error logs
tail -f /var/log/apache2/error.log
tail -f /var/log/nginx/error.log

# Filter logs by status code
grep " 404 " /var/log/nginx/access.log
grep " 5[0-9][0-9] " /var/log/nginx/access.log  # 5xx errors

# Count requests by IP
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr

# Most requested URLs
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -10

Application Log Analysis:

# Node.js/PM2 logs
pm2 logs                     # All application logs
pm2 logs app-name            # Specific app logs
pm2 logs --lines 100         # Last 100 lines

# Filter logs by level
grep "ERROR" /var/log/app.log
grep -E "(ERROR|FATAL)" /var/log/app.log

# Real-time log monitoring with filtering
tail -f /var/log/app.log | grep "ERROR"

Log Rotation and Management:

# Compress old logs
gzip /var/log/nginx/access.log.1

# Archive logs by date
logrotate /etc/logrotate.conf

# Clean up old logs
find /var/log -name "*.log.gz" -mtime +30 -delete

Docker Command Line Usage

Docker CLI Basics: Common Commands and Workflows

Essential Docker Commands:

# Image management
docker images                # List images
docker pull nginx:latest     # Download image
docker rmi image-name        # Remove image
docker build -t myapp .      # Build image from Dockerfile

# Container lifecycle
docker run nginx             # Run container
docker run -d nginx          # Run in background (detached)
docker run -p 8080:80 nginx  # Port mapping
docker run --name web nginx  # Named container
docker run -v /host:/container nginx  # Volume mount

# Container management
docker ps                    # Running containers
docker ps -a                 # All containers
docker stop container-id     # Stop container
docker start container-id    # Start stopped container
docker restart container-id  # Restart container
docker rm container-id       # Remove container

Advanced Docker Operations:

# Interactive containers
docker run -it ubuntu bash   # Interactive terminal
docker exec -it container-id bash  # Execute command in running container

# Environment variables
docker run -e NODE_ENV=production myapp
docker run --env-file .env myapp

# Networking
docker network ls            # List networks
docker network create mynet  # Create network
docker run --network mynet nginx

# Container inspection
docker inspect container-id  # Detailed container info
docker logs container-id     # Container logs
docker logs -f container-id  # Follow logs
docker stats                 # Resource usage

Dockerfiles: Creating and Understanding Dockerfiles

Basic Dockerfile Structure:

# Dockerfile for Node.js application
FROM node:16-alpine

# Set working directory
WORKDIR /app

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production

# Copy application code
COPY . .

# Expose port
EXPOSE 3000

# Create non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001
USER nextjs

# Start application
CMD ["npm", "start"]

Multi-stage Dockerfile:

# Build stage
FROM node:16-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:16-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
USER node
CMD ["npm", "start"]

Dockerfile Best Practices:

# Use specific versions
FROM node:16.14.2-alpine

# Leverage build cache by copying package.json first
COPY package*.json ./
RUN npm ci

# Use .dockerignore to exclude unnecessary files
# .dockerignore contents:
# node_modules
# .git
# .env

# Multi-line RUN commands for better caching
RUN apt-get update && \
    apt-get install -y curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Use HEALTHCHECK
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

Container Management: Running, Stopping, and Managing Containers

Docker Compose for Multi-Container Applications:

# docker-compose.yml
version: '3.8'

services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
    depends_on:
      - database
    volumes:
      - ./logs:/app/logs

  database:
    image: postgres:13
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    ports:
      - "6379:6379"

volumes:
  postgres_data:

Docker Compose Commands:

# Start services
docker-compose up            # Foreground
docker-compose up -d         # Background
docker-compose up --build    # Rebuild images

# Manage services
docker-compose ps            # Service status
docker-compose logs          # All service logs
docker-compose logs web      # Specific service logs
docker-compose exec web bash # Execute command in service

# Stop and cleanup
docker-compose down          # Stop and remove containers
docker-compose down -v       # Also remove volumes

Production Container Management:

# Resource limits
docker run --memory="512m" --cpus="1.5" myapp

# Restart policies
docker run --restart=always myapp        # Always restart
docker run --restart=unless-stopped myapp # Restart unless manually stopped
docker run --restart=on-failure:3 myapp  # Restart on failure, max 3 attempts

# Health checks and monitoring
docker run --health-cmd="curl -f http://localhost:3000/health" \
           --health-interval=30s \
           --health-timeout=10s \
           --health-retries=3 \
           myapp

# Container cleanup
docker system prune              # Remove unused containers, networks, images
docker system prune -a           # Remove all unused images
docker container prune           # Remove stopped containers
docker image prune              # Remove dangling images
docker volume prune             # Remove unused volumes

# Container backup and restore
docker export container-id > backup.tar
docker import backup.tar myapp:backup
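
Note that export/import capture only a container's filesystem and discard image history and metadata. For images themselves, save/load preserves layers and tags:

# Image backup and restore (keeps layers, tags, and metadata)
docker save myapp:latest | gzip > myapp.tar.gz
docker load < myapp.tar.gz    # docker load handles gzipped archives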

Command Line Version Control

Version Control Systems: Git, SVN Command Line Usage

Advanced Git Commands:

# Repository management
git clone --depth 1 url          # Shallow clone (faster)
git clone --branch dev url        # Clone specific branch
git remote -v                     # View remotes
git remote add upstream url       # Add upstream remote
git remote set-url origin new-url # Change remote URL

# Branch operations
git branch -a                     # List all branches
git branch -r                     # List remote branches
git branch -d branch-name         # Delete local branch
git push origin --delete branch   # Delete remote branch
git branch -m old-name new-name   # Rename branch

# Advanced log and history
git log --oneline --graph --all   # Visual branch history
git log --since="2 weeks ago"     # Commits since date
git log --author="John Doe"       # Commits by author
git log --grep="fix"              # Search commit messages
git log -p filename               # Show changes to specific file
git blame filename                # Show who changed each line

Git Workflow Commands:

# Feature branch workflow
git checkout -b feature/new-login
git add .
git commit -m "Add login functionality"
git push origin feature/new-login
git checkout main
git pull origin main
git merge feature/new-login
git push origin main
git branch -d feature/new-login

# Gitflow workflow
git flow init                     # Initialize gitflow
git flow feature start new-feature
git flow feature finish new-feature
git flow release start 1.0.0
git flow release finish 1.0.0

SVN Commands (for legacy projects):

# Basic SVN operations
svn checkout url local-dir        # Checkout repository
svn update                        # Update working copy
svn add filename                  # Add file to version control
svn commit -m "message"           # Commit changes
svn status                        # Check working copy status
svn diff                         # Show changes
svn log                          # Show commit history
svn revert filename              # Revert changes to file

Branching and Tagging: Best Practices for Branch Management

Branch Management Strategies:

# Git Flow branching model
main          # Production-ready code
develop       # Integration branch for features
feature/*     # Feature development branches
release/*     # Release preparation branches
hotfix/*      # Critical fixes for production

# GitHub Flow (simpler)
main          # Always deployable
feature/*     # Feature branches off main

# Creating and managing branches
git checkout -b feature/user-auth
git push -u origin feature/user-auth  # Set upstream tracking

# Branch protection and policies
git config branch.main.rebase true    # Always rebase when pulling main

Tagging for Releases:

# Lightweight tags
git tag v1.0.0                    # Create tag
git tag                           # List tags
git push origin v1.0.0            # Push specific tag
git push origin --tags            # Push all tags

# Annotated tags (recommended for releases)
git tag -a v1.0.0 -m "Release version 1.0.0"
git show v1.0.0                   # Show tag details

# Semantic versioning tags
git tag v1.0.0                    # Major release
git tag v1.0.1                    # Patch release
git tag v1.1.0                    # Minor release

# Release automation
git tag -a v1.2.0 -m "Release v1.2.0"
git push origin v1.2.0
# Trigger CI/CD pipeline for deployment

Stashing and Cleaning: Managing Uncommitted Changes

Git Stash Operations:

# Basic stashing
git stash                         # Stash current changes
git stash push -m "WIP: login feature"  # Stash with message
git stash list                    # List all stashes
git stash show                    # Show stash contents
git stash pop                     # Apply and remove latest stash
git stash apply                   # Apply stash without removing

# Advanced stashing
git stash push -- filename       # Stash specific file
git stash push -u                 # Include untracked files
git stash push -k                 # Stash but keep staged changes intact (--keep-index)
git stash drop stash@{1}          # Delete specific stash
git stash clear                   # Delete all stashes

# Stash branching
git stash branch new-feature stash@{0}  # Create branch from stash

Working Directory Cleanup:

# Clean untracked files
git clean -n                      # Dry run (show what would be deleted)
git clean -f                      # Remove untracked files
git clean -fd                     # Remove untracked files and directories
git clean -fx                     # Remove ignored files too

# Reset operations
git reset --soft HEAD~1           # Undo last commit, keep changes staged
git reset --mixed HEAD~1          # Undo last commit, unstage changes
git reset --hard HEAD~1           # Undo last commit, discard changes

# Checkout operations
git checkout -- filename         # Discard changes to file
git checkout .                    # Discard all changes
git checkout HEAD~1 -- filename   # Restore file from previous commit
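
Git 2.23+ splits these overloaded checkout forms into restore and switch, which read more clearly:

# Modern equivalents (Git 2.23+)
git restore filename              # Discard changes to file
git restore .                     # Discard all changes
git restore --staged filename     # Unstage a file
git switch -c new-branch          # Create and switch to a branch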

Performance Monitoring via CLI

Tools Overview: htop, vmstat, iostat

HTOP - Enhanced Process Viewer:

# Install htop
sudo apt install htop            # Ubuntu/Debian
sudo yum install htop            # CentOS/RHEL
brew install htop                # macOS

# Using htop
htop                             # Launch interactive process viewer
# Key shortcuts in htop:
# F1: Help
# F2: Setup (customize display)
# F3: Search processes
# F4: Filter processes
# F5: Tree view
# F6: Sort by column
# F9: Kill process
# F10: Quit

# Command line options
htop -u username                 # Show processes for specific user
htop -p PID1,PID2               # Monitor specific processes

VMSTAT - Virtual Memory Statistics:

# Basic usage
vmstat                          # Single snapshot
vmstat 5                        # Update every 5 seconds
vmstat 5 10                     # 10 updates, 5 seconds apart

# Understanding vmstat output:
# procs: r (running), b (blocked)
# memory: swpd (swap), free, buff (buffers), cache
# swap: si (swap in), so (swap out)
# io: bi (blocks in), bo (blocks out)
# system: in (interrupts), cs (context switches)
# cpu: us (user), sy (system), id (idle), wa (wait)

# Detailed memory info
vmstat -s                       # Memory statistics summary
vmstat -d                       # Disk statistics
vmstat -p /dev/sda1            # Partition statistics

IOSTAT - I/O Statistics:

# Install iostat (part of sysstat)
sudo apt install sysstat        # Ubuntu/Debian

# Basic usage
iostat                          # Current I/O stats
iostat 5                        # Update every 5 seconds
iostat -x                       # Extended statistics
iostat -h                       # Human readable format

# Monitor specific devices
iostat -x sda                   # Monitor specific disk
iostat -x 5 3                   # 3 reports, 5 seconds apart

# Understanding iostat output:
# %user, %nice, %system, %iowait, %steal, %idle
# Device stats: tps, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn

Real-Time Monitoring: Tracking System and Application Performance

System Resource Monitoring:

# CPU monitoring
top -p $(pgrep node)            # Monitor Node.js processes
watch -n 1 'ps aux --sort=-%cpu | head -20'  # Top CPU consumers

# Memory monitoring
watch -n 5 'free -h'            # Memory usage every 5 seconds
ps aux --sort=-%mem | head -10  # Top memory consumers
pmap -x PID                     # Process memory map

# Disk I/O monitoring
iotop                           # Real-time I/O monitoring
iotop -o                        # Only show active I/O
watch -n 1 'df -h'              # Disk space monitoring

# Network monitoring
iftop                           # Network bandwidth usage
nethogs                         # Network usage by process
ss -tuln                        # Socket statistics

Application-Specific Monitoring:

# Node.js application monitoring
pm2 monit                       # PM2 monitoring dashboard
pm2 show app-name               # Detailed app statistics
node --inspect app.js           # Enable debugging/profiling

# Web server monitoring
apachectl status                # Apache status
nginx -t                        # Nginx configuration test
curl -w "@curl-format.txt" http://localhost/

# Database monitoring
mysqladmin status               # MySQL status
mysqladmin processlist          # Active connections
psql -c 'SELECT * FROM pg_stat_activity;'   # PostgreSQL activity (pg_stat_activity is a view)

Custom Monitoring Scripts:

#!/bin/bash
# system-monitor.sh - Custom monitoring script

LOG_FILE="/var/log/system-monitor.log"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

# Function to log with timestamp
log_metric() {
    echo "[$TIMESTAMP] $1: $2" >> $LOG_FILE
}

# CPU usage
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | sed 's/%us,//')
log_metric "CPU_USAGE" "$CPU_USAGE"

# Memory usage
MEM_USAGE=$(free | grep Mem | awk '{printf "%.2f", $3/$2 * 100.0}')
log_metric "MEMORY_USAGE" "$MEM_USAGE%"

# Disk usage
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}')
log_metric "DISK_USAGE" "$DISK_USAGE"

# Load average
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}')
log_metric "LOAD_AVERAGE" "$LOAD_AVG"

# Check if critical processes are running
if ! pgrep nginx > /dev/null; then
    log_metric "ALERT" "Nginx is not running"
fi

if ! pgrep node > /dev/null; then
    log_metric "ALERT" "Node.js applications not found"
fi
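
To collect metrics continuously, schedule the script with cron (the interval and install path are illustrative):

# Run the monitor every 5 minutes
*/5 * * * * /usr/local/bin/system-monitor.sh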

Bottleneck Identification: Finding and Resolving Performance Issues

CPU Bottleneck Investigation:

# Identify CPU-intensive processes
top -o %CPU                     # Sort by CPU usage
ps aux --sort=-%cpu | head -10  # Top CPU consumers
perf top                        # Real-time CPU profiling (if available)

# Check CPU frequency and throttling
lscpu                           # CPU information
cat /proc/cpuinfo | grep MHz    # Current CPU frequency
cpupower frequency-info         # CPU frequency details

# Load average analysis
uptime                          # System load
w                              # Who is logged in and load
cat /proc/loadavg              # Detailed load information

Memory Bottleneck Investigation:

# Memory usage analysis
free -h                         # Overall memory usage
cat /proc/meminfo              # Detailed memory information
smem -tk                       # Memory usage by process (if available)

# Swap usage
swapon -s                      # Swap file usage
cat /proc/swaps                # Swap information

# Memory leak detection
valgrind --tool=memcheck --leak-check=full ./app  # For C/C++ apps
node --inspect --heap-prof app.js                 # Node.js heap profiling

I/O Bottleneck Investigation:

# Disk I/O analysis
iostat -x 1                    # Extended I/O stats
iotop -a                       # Accumulated I/O
lsof | grep -E "(REG|DIR)"     # Open files

# Check for I/O wait
vmstat 1                       # Look for high %wa (I/O wait)
sar -u 1 10                    # CPU utilization including I/O wait

# Disk performance testing
dd if=/dev/zero of=testfile bs=1G count=1 oflag=dsync  # Write test
dd if=testfile of=/dev/null bs=1G count=1              # Read test
rm testfile

Network Bottleneck Investigation:

# Network interface statistics
ip -s link                     # Interface statistics
ethtool eth0                   # Interface details
cat /proc/net/dev              # Network device statistics

# Connection analysis
netstat -i                     # Interface statistics
ss -s                          # Socket statistics summary
netstat -an | grep :80 | wc -l # Count HTTP connections

# Bandwidth testing
iperf3 -s                      # Start server (on one machine)
iperf3 -c server-ip            # Test from client
wget --output-document=/dev/null http://speedtest.wdc01.softlayer.com/downloads/test100.zip

Securing Web Projects through CLI

File Permissions: Setting and Understanding File Permissions

Understanding Linux File Permissions:

# Permission notation: rwxrwxrwx (user/group/others)
# r (read) = 4, w (write) = 2, x (execute) = 1
# Common combinations:
# 755 = rwxr-xr-x (owner: rwx, group: r-x, others: r-x)
# 644 = rw-r--r-- (owner: rw-, group: r--, others: r--)
# 600 = rw------- (owner: rw-, group: ---, others: ---)

# View permissions
ls -l                          # Detailed listing with permissions
ls -la                         # Include hidden files
stat filename                  # Detailed file information

Setting File Permissions:

# Numeric method
chmod 755 script.sh            # rwxr-xr-x
chmod 644 config.txt           # rw-r--r--
chmod 600 private.key          # rw-------
chmod 400 secret.txt           # r--------

# Symbolic method
chmod +x script.sh             # Add execute permission
chmod -w file.txt              # Remove write permission
chmod u+x,g-w,o-r file         # User +execute, group -write, others -read
chmod a+r file                 # All users +read

# Recursive permissions
chmod -R 755 /var/www/html     # Apply to directory and contents
find /var/www -type f -exec chmod 644 {} \;  # Files to 644
find /var/www -type d -exec chmod 755 {} \;  # Directories to 755

Web-Specific Permission Settings:

# WordPress/PHP application permissions
find /var/www/html -type f -exec chmod 644 {} \;
find /var/www/html -type d -exec chmod 755 {} \;
chmod 600 wp-config.php        # Sensitive config files

# Node.js application permissions
chmod 644 package.json         # Configuration files
chmod 755 bin/www              # Executable scripts
chmod 600 .env                 # Environment variables
chmod -R 755 node_modules/.bin # Binary executables

# Web server permissions
chown -R www-data:www-data /var/www/html  # Apache/Nginx user
chmod 640 /etc/apache2/sites-available/default  # Config files
chmod 600 /etc/ssl/private/server.key           # Private keys

SSL Certificates: Managing SSL/TLS for Web Security

Generating SSL Certificates:

# Self-signed certificate (development)
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

# Certificate Signing Request (CSR) for CA
openssl req -new -newkey rsa:4096 -keyout private.key -out request.csr -nodes

# Generate private key separately
openssl genrsa -out private.key 4096
openssl req -new -key private.key -out request.csr

Let's Encrypt with Certbot:

# Install Certbot
sudo apt install certbot python3-certbot-apache    # Apache
sudo apt install certbot python3-certbot-nginx     # Nginx

# Obtain certificate
sudo certbot --apache -d example.com               # Apache
sudo certbot --nginx -d example.com                # Nginx
sudo certbot certonly --standalone -d example.com  # Standalone

# Manual certificate renewal
sudo certbot renew
sudo certbot renew --dry-run    # Test renewal

# Auto-renewal setup
sudo crontab -e
# Add: 0 12 * * * /usr/bin/certbot renew --quiet

Certificate Management:

# View certificate details
openssl x509 -in certificate.crt -text -noout
openssl x509 -in certificate.crt -noout -dates     # Expiration dates
openssl x509 -in certificate.crt -noout -issuer    # Certificate issuer

# Test SSL configuration
openssl s_client -connect example.com:443 -servername example.com
nmap --script ssl-enum-ciphers -p 443 example.com

# Certificate chain verification
openssl verify -CAfile ca-bundle.crt certificate.crt

SSL/TLS Configuration:

# Strong SSL configuration for Apache
# /etc/apache2/sites-available/ssl.conf
SSLProtocol -all +TLSv1.2 +TLSv1.3
SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
SSLHonorCipherOrder on
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"

# Strong SSL configuration for Nginx
# /etc/nginx/sites-available/ssl.conf
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers on;
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

Security Audits: Basic Command Line Tools for Security Checking

System Security Auditing:

# Check for security updates
apt list --upgradable           # Ubuntu/Debian
yum check-update               # CentOS/RHEL
dnf check-update               # Fedora

# Scan for open ports
nmap -sS localhost             # TCP SYN scan
nmap -sU localhost             # UDP scan
netstat -tuln                  # List listening ports
ss -tuln                       # Modern alternative to netstat

# Check running services
systemctl list-units --type=service --state=running
ps aux | grep -E "(apache|nginx|mysql|ssh)"

# File integrity monitoring
find /etc -type f -name "*.conf" -exec ls -l {} \;
find /var/www -type f -perm /o+w  # World-writable files (security risk)
find / -perm -4000 2>/dev/null    # SUID files
find / -perm -2000 2>/dev/null    # SGID files

Web Application Security Scanning:

# Basic web server security checks
curl -I https://example.com | grep -E "(Server|X-|Strict)"
curl -v https://example.com 2>&1 | grep -E "(SSL|TLS)"

# Security headers check
curl -I https://example.com
# Look for:
# Strict-Transport-Security
# Content-Security-Policy
# X-Frame-Options
# X-Content-Type-Options
# Referrer-Policy

# Directory traversal check (--path-as-is stops curl from collapsing ../ itself)
curl --path-as-is https://example.com/../../../etc/passwd
curl --path-as-is https://example.com/../../../../etc/passwd

# SQL injection basic test (URL-encode the payload so curl sends it intact)
curl -G --data-urlencode "q='; DROP TABLE users; --" https://example.com/search

Log Analysis for Security:

# Authentication log analysis
grep "Failed password" /var/log/auth.log
grep "Accepted password" /var/log/auth.log
lastlog                        # Last login times
last                          # User login history

# Web server security log analysis
grep -E "(40[0-9]|50[0-9])" /var/log/nginx/access.log  # HTTP errors
grep -E "(\.\./|etc/passwd|cmd=)" /var/log/nginx/access.log  # Attack patterns
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr  # IP frequency

# Fail2Ban monitoring (if installed)
fail2ban-client status
fail2ban-client status sshd

Network Security Monitoring:

# Monitor network connections
watch -n 1 'netstat -tuln'
tcpdump -i eth0 port 80        # Capture HTTP traffic
tcpdump -i eth0 -w capture.pcap # Save to file

# Intrusion detection
chkrootkit                     # Rootkit checker (if installed)
rkhunter --check               # Rootkit hunter (if installed)

# Firewall status
ufw status                     # Ubuntu firewall
iptables -L                    # List iptables rules
firewall-cmd --list-all        # CentOS/RHEL firewall

Text Manipulation and Log Analysis

Essential Commands: Mastery of sed, awk, grep

GREP - Pattern Searching:

# Basic grep usage
grep "pattern" filename
grep -i "pattern" filename      # Case insensitive
grep -v "pattern" filename      # Invert match (exclude)
grep -n "pattern" filename      # Show line numbers
grep -c "pattern" filename      # Count matches

# Advanced grep options
grep -r "pattern" directory/    # Recursive search
grep -l "pattern" *.txt        # List files with matches
grep -L "pattern" *.txt        # List files without matches
grep -A 3 -B 3 "pattern" file  # Show 3 lines after and before match
grep -C 3 "pattern" file       # Show 3 lines of context

# Multiple patterns
grep -E "(error|warning|critical)" logfile
grep -e "pattern1" -e "pattern2" file
fgrep "literal.string" file    # Fixed string search (no regex)

# Practical examples
grep -r "TODO" src/            # Find TODO comments
grep -n "function" *.js        # Find function definitions
ps aux | grep nginx            # Find nginx processes

SED - Stream Editor:

# Basic substitution
sed 's/old/new/' filename              # Replace first occurrence per line
sed 's/old/new/g' filename             # Replace all occurrences
sed 's/old/new/gi' filename            # Case insensitive global replace
sed -i 's/old/new/g' filename          # Edit file in place

# Line operations
sed '3d' filename                      # Delete line 3
sed '1,5d' filename                    # Delete lines 1-5
sed '/pattern/d' filename              # Delete lines matching pattern
sed '10q' filename                     # Quit after line 10 (like head -10)

# Advanced sed operations
sed -n '5,10p' filename                # Print lines 5-10 only
sed 's/^/    /' filename               # Add 4 spaces to beginning of each line
sed 's/[[:space:]]*$//' filename       # Remove trailing whitespace
sed '/^$/d' filename                   # Remove empty lines

# Practical examples
sed 's/http:/https:/g' config.txt      # Update URLs to HTTPS
sed -i 's/localhost/production-server/g' *.conf  # Update server references
sed 's/.*ERROR.*/[REDACTED]/' logfile  # Redact error messages

AWK - Pattern Scanning and Processing:

# Basic awk usage
awk '{print $1}' filename              # Print first column
awk '{print $1, $3}' filename          # Print columns 1 and 3
awk '{print NF, $0}' filename          # Print number of fields and line
awk '{print NR, $0}' filename          # Print line number and line

# Field separators
awk -F: '{print $1}' /etc/passwd       # Use : as field separator
awk -F',' '{print $2}' data.csv        # Process CSV files
awk 'BEGIN{FS=","} {print $1, $2}' file

# Conditional processing
awk '$1 > 100 {print $0}' numbers.txt  # Print lines where first field > 100
awk '/ERROR/ {print $0}' logfile       # Print lines containing ERROR
awk 'NF > 5 {print $0}' file           # Print lines with more than 5 fields

# Advanced awk examples
awk '{sum += $1} END {print sum}' numbers.txt        # Sum first column
awk '{count++} END {print count}' filename           # Count lines
awk '{if($1 > max) max=$1} END {print max}' numbers  # Find maximum value

Regular Expressions: Using Regex for Text Manipulation

Basic Regular Expression Patterns:

# Character classes
[0-9]           # Any digit
[a-z]           # Any lowercase letter
[A-Z]           # Any uppercase letter
[a-zA-Z0-9]     # Any alphanumeric character
\d              # Digit (in some tools)
\w              # Word character
\s              # Whitespace character

# Quantifiers
*               # Zero or more
+               # One or more
?               # Zero or one
{3}             # Exactly 3
{3,5}           # Between 3 and 5
{3,}            # 3 or more

# Anchors and boundaries
^               # Beginning of line
$               # End of line
\b              # Word boundary

Practical Regex Examples:

# Email validation
grep -E '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' emails.txt

# IP address extraction
grep -oE '\b([0-9]{1,3}\.){3}[0-9]{1,3}\b' logfile

# Phone number formats
grep -E '\b\d{3}-\d{3}-\d{4}\b' contacts.txt      # 123-456-7890
grep -E '\(\d{3}\) \d{3}-\d{4}' contacts.txt      # (123) 456-7890

# URL extraction
grep -oE 'https?://[^[:space:]]+' textfile

# Credit card number masking
sed -E 's/[0-9]{4}-[0-9]{4}-[0-9]{4}-([0-9]{4})/****-****-****-\1/g' file

# Date format conversion
sed -E 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})/\3-\1-\2/g' dates.txt  # MM/DD/YYYY to YYYY-MM-DD

Log File Parsing: Techniques for Efficient Log Analysis

Apache/Nginx Access Log Analysis:

# Common Log Format: IP - - [timestamp] "request" status size
# Combined Log Format adds: "referer" "user-agent"

# Top 10 IP addresses
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10

# Top 10 requested pages
awk '{print $7}' access.log | sort | uniq -c | sort -nr | head -10

# HTTP status code distribution
awk '{print $9}' access.log | sort | uniq -c | sort -nr

# Bandwidth usage by IP
awk '{bytes[$1] += $10} END {for (ip in bytes) print ip, bytes[ip]}' access.log | sort -k2 -nr

# Requests per hour (characters 2-15 of the timestamp field cover the date and hour)
awk '{print substr($4, 2, 14)}' access.log | uniq -c

# 404 errors with requested URLs
awk '$9 == 404 {print $7}' access.log | sort | uniq -c | sort -nr

# User agents analysis
awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -nr | head -10

Application Log Analysis:

# Error log analysis
grep -i error application.log | wc -l                # Count errors
grep -i "fatal\|critical" application.log            # Severe errors only
awk '/ERROR/ {error_count++} END {print "Errors:", error_count}' app.log

# Performance analysis from logs
grep "response time" app.log | awk '{sum += $NF; count++} END {print "Avg response time:", sum/count}'

# Session analysis
grep "session" app.log | awk '{print $3}' | sort | uniq | wc -l  # Unique sessions

# Database query analysis
grep -o "SELECT.*FROM [a-zA-Z_]*" app.log | sort | uniq -c | sort -nr

System Log Analysis:

# Authentication logs
grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -nr

# Successful logins
grep "Accepted password" /var/log/auth.log | awk '{print $9, $11}' | sort | uniq

# System startup services
grep "Started" /var/log/syslog | awk '{for(i=6;i<=NF;i++) printf "%s ", $i; print ""}'

# Disk space warnings
grep -i "disk\|space\|full" /var/log/syslog

# Memory issues
grep -i "out of memory\|oom\|killed process" /var/log/syslog

Advanced Log Processing Scripts:

#!/bin/bash
# log-analyzer.sh - Comprehensive log analysis

LOG_FILE=$1
REPORT_FILE="log_report_$(date +%Y%m%d).txt"

if [ -z "$LOG_FILE" ]; then
    echo "Usage: $0 <log_file>"
    exit 1
fi

echo "Log Analysis Report - $(date)" > $REPORT_FILE
echo "=================================" >> $REPORT_FILE
echo "" >> $REPORT_FILE

# Basic statistics
echo "BASIC STATISTICS:" >> $REPORT_FILE
echo "Total lines: $(wc -l < $LOG_FILE)" >> $REPORT_FILE
echo "Date range: $(head -1 $LOG_FILE | awk '{print $4}') to $(tail -1 $LOG_FILE | awk '{print $4}')" >> $REPORT_FILE
echo "" >> $REPORT_FILE

# HTTP status codes
echo "HTTP