Below are brief explanations of the potentially complex terms and jargon used in the video (English).
- Epoch: one full pass through the entire training dataset during model training.
- Batch / batch size: a subset of training samples processed before updating model weights; batch size = number of samples per update.
- Iteration: one update step (processing one batch).
- Loss function (e.g., sparse categorical crossentropy): a measure of model error; sparse categorical crossentropy is used for multi-class classification when labels are integer-encoded.
- Optimizer (e.g., Adam, stochastic gradient descent - SGD): algorithm that updates model weights to minimize the loss. Plain SGD moves each weight against its gradient, scaled by a single learning rate; Adam adapts the step size per weight using running averages of past gradients and squared gradients.
- Metrics (e.g., accuracy): performance measures tracked during training/evaluation; accuracy = fraction of correct predictions.
- Overfitting: when a model learns training data (including noise) too well and performs poorly on new data.
- Dropout: regularization technique that randomly zeroes a fraction of a layer's activations on each training step to reduce overfitting; it is switched off at inference time.
- Validation set: subset of data used to tune hyperparameters and monitor generalization during training (not used to update weights).
- Test set: held-out data used to evaluate final model performance.
- Data augmentation: generating modified versions of training images (e.g., flip, rotate) to increase dataset variety and improve generalization.
- Normalization (e.g., dividing pixel values by 255): scaling input values to a standard range (commonly 0–1) to stabilize training.
- Grayscale: single-channel image (no color), intensity values only.
- Input shape (e.g., 28×28×1): dimensions of a single sample fed into the model, in height × width × channels order; the batch dimension is not included.
- Reshape: change array dimensions (e.g., add channel dimension to images) without altering data.
- Flatten: convert multi-dimensional feature maps into a 1D vector before fully connected layers.
- Convolutional layer (Conv2D): layer that applies filters/kernels to input images to extract local features.
- Kernel / filter size (e.g., 3×3, 7×7): spatial dimensions of the convolutional filter.
- Stride: step size with which the convolution filter moves across the input.
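- Padding (e.g., 'same', 'valid'): how borders are handled. 'same' pads the input so that, with stride 1, the output spatial dimensions equal the input's (with stride s, the output size is the input size divided by s, rounded up); 'valid' means no padding, so the output shrinks.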
- Feature map: output channel of a convolutional layer representing detected features.
- Number of filters (channels): number of different kernels in a Conv layer; determines output depth.
- Activation function (e.g., ReLU): non-linear function applied after a layer; ReLU(x)=max(0,x).
- Max pooling (MaxPool2D): downsampling operation that reduces spatial dimensions by taking the maximum in each window (e.g., 2×2).
- Fully connected (Dense) layer: layer where every input node connects to every output node; used for final classification.
- Softmax (implied by categorical outputs): activation that converts logits to class probabilities summing to 1.
- Argmax: operation returning index of the maximum value (used to get predicted class from probabilities).
- One-hot vs. sparse labels: one-hot encodes class as a binary vector; sparse uses integer label per sample.
- Model.compile / fit / evaluate / predict: Keras model methods to configure training (compile), train the model (fit), measure loss and metrics on held-out data such as the test set (evaluate), and generate predictions (predict).
- Model.save / load (restore): persist trained model to disk (e.g., H5) and later reload for inference or further training.
- History object: returned by fit; contains training/validation loss and metric values per epoch.
- Learning rate: hyperparameter controlling step size in optimizer updates.
- Hyperparameters: model/training settings set before training (e.g., epochs, batch size, learning rate, number of filters).
- Feature extraction: process of transforming raw input (images) into meaningful features used by the classifier.
- Receptive field (informal): the region of the input image that affects a particular feature in deeper layers; it grows as convolution and pooling layers are stacked.
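
Several of the terms above come down to simple arithmetic, so a minimal sketch in plain Python may help make them concrete. These are not the Keras calls from the video; every function name below is hypothetical, and the numbers (60,000 samples, batch size 32, 28×28 images, the three logits) are illustrative assumptions. The size rules follow the usual Keras-style conventions for 'same' and 'valid' padding.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    # Iterations (weight updates) in one epoch; the last batch may be smaller.
    return math.ceil(num_samples / batch_size)

def conv_output_size(size, kernel, stride=1, padding="valid"):
    # Output height or width of a convolution or pooling window.
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # 'valid': no padding

def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    # Index of the largest value, i.e. the predicted class.
    return max(range(len(values)), key=lambda i: values[i])

def one_hot(label, num_classes):
    # Integer (sparse) label -> binary vector.
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def sparse_categorical_crossentropy(probs, label):
    # Loss for one sample when the label is an integer class index.
    return -math.log(probs[label])

def categorical_crossentropy(probs, one_hot_label):
    # Same loss when the label is one-hot encoded.
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

def receptive_field(layers):
    # layers: (kernel, stride) pairs listed from input to output.
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

print(steps_per_epoch(60000, 32))                            # 1875
print(conv_output_size(28, 3))                               # 26 ('valid')
print(conv_output_size(28, 3, padding="same"))               # 28
print(conv_output_size(26, 2, stride=2))                     # 13 (2x2 max pool)

probs = softmax([2.0, 1.0, 0.1])
print(argmax(probs))                                         # 0
print(sparse_categorical_crossentropy(probs, 0))
print(categorical_crossentropy(probs, one_hot(0, 3)))
print(receptive_field([(3, 1), (3, 1)]))                     # 5
```

The two loss calls print the same value, which is the point of the one-hot vs. sparse distinction: the encodings differ, the loss does not. The last line shows that two stacked 3×3 convolutions see a 5×5 region of the input.
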
If you want concise Arabic equivalents or code-related clarifications (examples of Keras/TensorFlow calls used in the video), say which terms you want translated or exemplified.