
Basically osdev notes; simplekernel is a very minimal operating system made for learning and conceptually understanding OS theory, hypervisors, and so on. This is a part of my "Understanding of OS theoretical concepts".

wizardengineer/simplekernel

Motivation

Building this was a great learning experience that helped me develop a stronger theoretical and conceptual grasp of how operating systems and kernels work under the hood. I had presumed I had an okay idea of how OSs worked. However, in my honest opinion, creating my own OS made things far more concrete than just reading about them. It was also a way for me to better understand the software I would want to mess with in kernel mode and user mode.

Resources I relied on:

A Table of Contents of Things I've learnt:

Before we begin: even though the kernel and operating system are 32-bit, I will also explain some 64-bit concepts, one of them being Long Mode.

Modes

  • Real Mode
    The name derives from the idea that its addresses always correspond to real locations in memory, i.e. physical memory. In comparison to other modes, Real Mode is a simple and limited 16-bit mode that is present on every x86 processor. It was the only mode provided by early x86 CPU designs, until Protected Mode first appeared with the Intel 80286. It is limited compared to its successors, Protected Mode and Long Mode, because it has less than 1 MB of RAM available for use. There is no hardware-based memory protection (such as the GDT-based protection of Protected Mode), no virtual memory, and no security mitigations. Furthermore, don't let the limited size mislead you about what Real Mode can access: on a 386 or later it can still use the 32-bit registers (eax, ebx, ...). The CPU starts in Real Mode, so some code, such as the bootloader, runs there first before any other mode is entered. Real Mode is considered the true way of accessing the BIOS and its low-level API functionality.
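To make the size limit concrete, here is a minimal sketch of how a Real Mode address is formed from a 16-bit segment and a 16-bit offset (physical = segment * 16 + offset). The helper name real_mode_address is hypothetical and not part of simplekernel.

```c
#include <stdint.h>

/* Real Mode address translation: physical = segment * 16 + offset.
 * real_mode_address is a hypothetical helper, not part of simplekernel. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    /* Shifting left by 4 is the same as multiplying the segment by 16. */
    return ((uint32_t)segment << 4) + offset;
}
```

For example, 0x07C0:0x0000 gives 0x7C00, the address where the BIOS loads a boot sector, and 0xB800:0x0000 gives 0xB8000, the VGA text buffer. The largest result, 0xFFFF:0xFFFF, is 0x10FFEF, which is only slightly above 1 MB.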

  • Protected Mode
    This mode is a feature for 32-bit operating systems and is entered after Real Mode. It provides a set of features that, once enabled, give more systematic control over software. Such features include virtual memory, paging, and safe multitasking. Memory segmentation is not optional in Protected Mode and must be set up before entering it. Thanks to GRUB's amazing help, I don't need to program my own switch to Protected Mode, as GRUB provides it. I gave a simple example of how Protected Mode would be programmed in assembly in protected_mode.asm.

Read up more on Protected Mode in Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1; Chapter 3.

  • Long Mode
    This mode gives x86_64 computers access to the 64-bit registers and instructions that 64-bit operating systems rely on. The bootloader starts in Real Mode; the 64-bit kernel's startup code then checks that the CPU supports Long Mode and switches to it by setting the LME flag (bit 8) in the EFER model-specific register (the IA32_EFER MSR).
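The check and the switch boil down to two bit tests. The sketch below works on plain integers so it can be read without inline assembly; the function names are hypothetical, and the actual cpuid, rdmsr, and wrmsr instructions are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_FEATURES 0x80000001u /* CPUID leaf reporting extended features */
#define CPUID_EDX_LM       (1u << 29)  /* EDX bit 29: Long Mode supported */
#define MSR_EFER           0xC0000080u /* address of the IA32_EFER MSR */
#define EFER_LME           (1ull << 8) /* Long Mode Enable */

/* True if the EDX value returned by CPUID leaf 0x80000001 reports Long Mode. */
bool cpu_has_long_mode(uint32_t cpuid_edx)
{
    return (cpuid_edx & CPUID_EDX_LM) != 0;
}

/* Returns the EFER value with LME set; the caller would write it back with wrmsr. */
uint64_t efer_enable_long_mode(uint64_t efer)
{
    return efer | EFER_LME;
}
```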

Interrupts


  • What are Interrupts
    You can think of an interrupt as a signal sent by a device (such as a keyboard, solid state drive, hard drive, or mouse) or by software that tells the CPU an event happened and that it needs to immediately stop what it's currently doing and attend to whatever sent the interrupt.

    E.g. when you move or click a mouse, the mouse controller sends an interrupt to the interrupt controller for the CPU. The CPU's attention immediately goes to the mouse interrupt, and it executes a routine that handles the mouse movement or click. After the mouse interrupt is handled, the CPU continues doing whatever it was doing before the interrupt, or goes on to manage another interrupt if one has been signaled.

    
            +------------------------+            +------------+
            |   TYPES OF INTERRUPTS  |------------| Exceptions |
            +------------------------+            +------------+
                   /         \
                  /           \
                 /             \ 
     +------------+           +------------+
     |  HARDWARE  |           |  SOFTWARE  |
     | INTERRUPTS |           | INTERRUPTS |
     +------------+           +------------+

  • Interrupt Request (IRQ)
    Interrupt Requests (IRQs), or hardware interrupts, are generated externally by the chipset on behalf of the hardware. Through the course of simplekernel, I've only set up 16 ISRs for the 16 IRQs (0-15). There are two types of IRQs in common use today.

IRQ Lines, or Pin-based IRQs: These are typically statically routed on the chipset. Wires or lines run from the devices on the chipset to an IRQ controller which serializes the interrupt requests sent by devices, sending them to the CPU one by one to prevent races. In many cases, an IRQ Controller will send multiple IRQs to the CPU at once, based on the priority of the device. An example of a very well known IRQ Controller is the Intel 8259 controller chain, which is present on all IBM-PC compatible chipsets, chaining two controllers together, each providing 8 input pins for a total of 16 usable IRQ signalling pins on the legacy IBM-PC.

Message Based Interrupts: These are signalled by writing a value to a memory location reserved for information about the interrupting device, the interrupt itself, and the vectoring information. The device is assigned a location to which it writes either by firmware or by the kernel software. Then, an IRQ is generated by the device using an arbitration protocol specific to the device's bus. An example of a bus which provides message based interrupt functionality is the PCI Bus. By wiki.osdev - https://wiki.osdev.org/Interrupts
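With two chained 8259 controllers, each of the 16 IRQ lines ends up on its own interrupt vector. The sketch below assumes the common convention of remapping IRQ 0-7 to vectors 0x20-0x27 and IRQ 8-15 to 0x28-0x2F; those offsets and the function names are assumptions for illustration, not values taken from simplekernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define PIC_MASTER_OFFSET 0x20u /* assumed vector base for IRQ 0-7  */
#define PIC_SLAVE_OFFSET  0x28u /* assumed vector base for IRQ 8-15 */

/* Map a legacy IRQ line (0-15) to its interrupt vector; returns -1 if out of range. */
int irq_to_vector(unsigned irq)
{
    if (irq > 15)
        return -1;
    return irq < 8 ? (int)(PIC_MASTER_OFFSET + irq)
                   : (int)(PIC_SLAVE_OFFSET + (irq - 8));
}

/* IRQ 8-15 arrive through the second (slave) controller in the chain. */
bool irq_is_on_slave(unsigned irq)
{
    return irq >= 8 && irq <= 15;
}
```

Under these assumed offsets the timer (IRQ 0) lands on vector 32 and the last line (IRQ 15) on vector 47.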



  • Interrupt Service Routine (ISR)
    ISRs are routines that save the current state of the processor and set up the appropriate registers and segment registers needed for kernel mode before the C-level interrupt handler is called.
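The hand-off from the assembly stub to the C-level handler can be sketched as a table of function pointers indexed by vector number. The registers_t layout, the function names, and the example timer handler below are all hypothetical; the real layout depends on what the assembly stub pushes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the state an assembly ISR stub might save. */
typedef struct
{
    uint32_t ds;                 /* data segment in use when interrupted */
    uint32_t eax, ebx, ecx, edx; /* general purpose registers */
    uint32_t int_no;             /* interrupt vector number */
    uint32_t err_code;           /* error code, or 0 when the CPU supplies none */
} registers_t;

typedef void (*isr_handler_t)(registers_t *regs);

/* One slot per possible vector; NULL means no handler is installed. */
static isr_handler_t handlers[256];

void register_handler(uint8_t vector, isr_handler_t handler)
{
    handlers[vector] = handler;
}

/* Called by the stub once state is saved; returns 1 if a handler ran, 0 otherwise. */
int isr_dispatch(registers_t *regs)
{
    if (regs->int_no >= 256 || handlers[regs->int_no] == NULL)
        return 0;
    handlers[regs->int_no](regs);
    return 1;
}

/* Example C-level handler: counts timer ticks. */
uint32_t timer_ticks;

void timer_handler(registers_t *regs)
{
    (void)regs; /* the saved state is not needed to count a tick */
    timer_ticks++;
}
```

A kernel would call register_handler(32, timer_handler) once during setup, assuming the timer was remapped to vector 32, and the stub would call isr_dispatch on every interrupt.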

  • What are exceptions
    Exceptions are a type of interrupt generated internally by the CPU. They are raised when an unexpected event occurs within the CPU, such as a division by zero or a page fault.
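On x86 the first 32 vectors (0-31) are reserved for exceptions. The sketch below names a handful of well-known ones; it is deliberately not a complete table, and the function names are hypothetical.

```c
#include <stddef.h>

/* Returns the name of a few well-known x86 exceptions, or NULL for other vectors. */
const char *exception_name(unsigned vector)
{
    switch (vector) {
    case 0:  return "Divide Error";
    case 6:  return "Invalid Opcode";
    case 8:  return "Double Fault";
    case 13: return "General Protection Fault";
    case 14: return "Page Fault";
    default: return NULL;
    }
}

/* Vectors 0-31 are reserved by the architecture for exceptions. */
int is_exception_vector(unsigned vector)
{
    return vector < 32;
}
```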

Descriptor

  • Keywords
    • Entry - An entry defines where a region of memory starts, along with the limit of the region and the access privileges associated with it. The access privilege tells the processor whether code is running as System (ring 0) or Application (ring 3). It prevents applications (user mode) from accessing certain registers and instructions, such as the CR registers and cli/sti respectively.

    • Limit - The size of the segment that the Segment Descriptor describes

    • Segment Selector - The value held in a segment register; it contains the index of a Descriptor

      • to be more explicit, an index is not a selector; the index is only one part of the selector

      • Things a Segment Register holds besides the index:

        1. The privilege level used when accessing the Descriptor Table. This is called the RPL (Requested Privilege Level) for every segment register, but for cs it is called the CPL (Current Privilege Level). They serve different purposes, which you can find out about in the Intel or AMD manuals.

        2. The table to look into. One table is the GDT, the other one is the LDT.

      • An informal rule for conceptually imagining the use of a selector:

        selector = index + table_to_use + privilege
        table_to_use + index = descriptor = all the information about the segment of memory to be used

        where the plus sign is not an arithmetic operation

      • Bit field of segment selector register:

      15                                                 3    2        0
      +--------------------------------------------------+----+--------+
      |          Index                                   | TI |   RPL  |
      +--------------------------------------------------+----+--------+
      
      TI = Table Indicator: 0 = GDT, 1 = LDT
      
      The TI specifies whether the Descriptor we're dealing with is in the GDT or the LDT
      IF(TI == 0) THEN
          ... THE DESCRIPTOR IS IN THE GDT
      ELSEIF(TI == 1) THEN
          ... THE DESCRIPTOR IS IN THE LDT
      ENDIF
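The bit field above can be taken apart with shifts and masks. This is a minimal sketch; selector_fields_t, decode_selector, and make_selector are hypothetical names used only for illustration.

```c
#include <stdint.h>

typedef struct
{
    uint16_t index; /* bits 15-3: entry number in the table */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1-0: requested privilege level */
} selector_fields_t;

/* Split a 16-bit segment selector into its three fields. */
selector_fields_t decode_selector(uint16_t selector)
{
    selector_fields_t f;
    f.index = selector >> 3;
    f.ti    = (selector >> 2) & 0x1;
    f.rpl   = selector & 0x3;
    return f;
}

/* Build a selector from its fields (the inverse of decode_selector). */
uint16_t make_selector(uint16_t index, uint8_t ti, uint8_t rpl)
{
    return (uint16_t)((index << 3) | ((ti & 0x1) << 2) | (rpl & 0x3));
}
```

For example, the selector 0x08 decodes to index 1 in the GDT with RPL 0, and 0x1B decodes to index 3 in the GDT with RPL 3.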
      

  • What exactly is a Table and a Descriptor
    Simply put, Descriptor Tables are data structures. You can think of a Table as an array and the Descriptors as the elements of that array. The Segment Selector holds the index that picks out one Descriptor in the Table.
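The array analogy translates directly into arithmetic: every descriptor is 8 bytes, so the index part of a selector times 8 is the byte offset of the descriptor inside its table. A minimal sketch with hypothetical function names:

```c
#include <stdint.h>

#define DESCRIPTOR_SIZE 8u /* each descriptor is 8 bytes */

/* Byte offset, inside its table, of the descriptor a selector refers to. */
uint32_t descriptor_offset(uint16_t selector)
{
    /* Dropping the low 3 bits (TI and RPL) leaves the index. */
    return (uint32_t)(selector >> 3) * DESCRIPTOR_SIZE;
}

/* A selector fits a table only if the whole 8-byte descriptor lies within
 * the table limit (the limit is the table size in bytes, minus 1). */
int selector_in_table(uint16_t selector, uint16_t table_limit)
{
    return descriptor_offset(selector) + DESCRIPTOR_SIZE - 1 <= table_limit;
}
```

With a three-entry table (limit 23), selector 0x10 refers to the last valid entry and selector 0x18 falls outside the table.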

  • Use case of GDT
    The Global Descriptor Table (GDT) is one of the segment descriptor tables. It is a protection-oriented data structure whose entries (aka Segment Descriptors) each describe a segment of memory together with the privileges assigned to that memory region. Each entry holds the base (where the segment starts in memory), the limit (the size of the segment), and the access privileges of the segment.

The GDT is 1:1 with the Logical Address. An example of the GDT working with the selector:

<---- Selector ---->    +----- Segment Selector Register
+-------+----+-----+    v
| Index | TI | RPL | = DS
+-------+----+-----+            GDT                        LDT
   |      |             +---------------------+   +---------------------+
   |      +------------>| Null Descriptor     |   | Null Descriptor     |
   |                    +---------------------+   +---------------------+
   |                    | Descriptor 1        |   | Descriptor 1        |
   |                    +---------------------+   +---------------------+
   |                    |                     |   |                     |
   |                    ...     ...    ...   ...  ...     ...    ...   ...
   |                    |                     |
   |                    +---------------------+
   +------------------->| Descriptor K        |
                        +---------------------+
                        |                     |
                        ...     ...    ...   ...

RPL (Requested Privilege Level) describes the privilege level used for accessing the descriptor.

We store the GDT base (address) and limit (size of our GDT in bytes, minus 1) in a GDTR structure. The base points to our GDT entries in memory. After that, the structure is loaded into the GDTR with the lgdt instruction:

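#include <stdint.h>

/* One 8-byte GDT entry (segment descriptor). */
typedef union _gdt_descriptor
{
  struct
  {
    uint64_t limit_low    : 16; /* limit bits 0-15 */
    uint64_t base_low     : 16; /* base bits 0-15 */
    uint64_t base_middle  : 8;  /* base bits 16-23 */
    uint64_t access       : 8;  /* present bit, privilege level, type */
    uint64_t granularity  : 8;  /* limit bits 16-19 plus flags */
    uint64_t base_high    : 8;  /* base bits 24-31 */
  };
} __attribute__((packed)) gdt_entry_t;

gdt_entry_t gdt_entrys[256];

/* The GDTR (GDT Register) */
struct gdtr
{
    uint16_t limit; /* size of the table in bytes, minus 1 */
    uint32_t base;  /* address of the first entry */
} __attribute__((packed)) gdtr;

...

/* 32-bit kernel: the address of the table fits in 32 bits */
gdtr.base  = (uint32_t)&gdt_entrys;
gdtr.limit = (sizeof(gdt_entry_t) * 256) - 1;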
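To show how the scattered base and limit fields fit together, here is a sketch that packs one descriptor into a single 64-bit value. gdt_encode is a hypothetical helper, not simplekernel's own function; the low nibble of the granularity byte above corresponds to limit bits 16-19 and its high nibble to the flags parameter here.

```c
#include <stdint.h>

/* Pack base, limit, access byte, and flags nibble into one 8-byte descriptor. */
uint64_t gdt_encode(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);            /* limit bits 0-15  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;     /* base bits 0-23   -> bits 16-39 */
    d |= (uint64_t)access << 40;                /* access byte      -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48; /* limit bits 16-19 -> bits 48-51 */
    d |= (uint64_t)(flags & 0xF) << 52;         /* flags nibble     -> bits 52-55 */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56; /* base bits 24-31  -> bits 56-63 */
    return d;
}
```

A flat 4 GB ring 0 code segment (base 0, limit 0xFFFFF, access 0x9A, flags 0xC) encodes to 0x00CF9A000000FFFF, and the matching data segment (access 0x92) to 0x00CF92000000FFFF.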

Read up more on it in the AMD64 Architecture Programmer’s Manual, Volume 2, Section 4.7 (pp. 84-90+)



Paging

  • MMU (Memory Management Unit)
    The MMU (Memory Management Unit) is a vital piece of hardware that performs address translation. It first transforms a logical address into a linear address, with the magic of a hardware circuit called the segmentation unit. It then transforms that linear address into a physical address with the help of a second hardware circuit, the paging unit.
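The first stage can be sketched in a few lines: the segmentation unit adds the offset to the segment base after checking it against the limit. This is a simplified model (byte-granular, expand-up segments only), and segment_t and logical_to_linear are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a segment: where it starts and its last valid offset. */
typedef struct
{
    uint32_t base;
    uint32_t limit;
} segment_t;

/* Segmentation stage: logical (segment, offset) -> linear address.
 * Returns 0 on success, -1 if the offset is beyond the segment limit. */
int logical_to_linear(const segment_t *seg, uint32_t offset, uint32_t *linear)
{
    if (offset > seg->limit)
        return -1; /* real hardware would raise an exception here */
    *linear = seg->base + offset;
    return 0;
}
```

With a flat segment (base 0, limit 0xFFFFFFFF) the linear address equals the offset, which is why segmentation is nearly invisible in most modern kernels.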

Reference for this and everything onwards: https://notes.shichao.io/utlk/ch2/#paging-in-hardware. This was by far the most fun I had; I was extremely excited once I understood it.

  • x86 OS Legacy-paging virtual address with 4KB pages:

  • x86 OS CR4.PAE paging virtual address with 4KB page:

  • x86_64 (IA-32e) OS IA32_EFER.LME/CR4.PAE paging virtual address with 4KB page:
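The three layouts listed above differ only in how the virtual address is sliced into table indexes. The sketch below splits an address for each mode with 4KB pages; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Legacy 32-bit paging: 10-bit directory index, 10-bit table index, 12-bit offset. */
typedef struct { uint32_t pd, pt, offset; } legacy_va_t;

legacy_va_t split_legacy(uint32_t va)
{
    legacy_va_t r = { va >> 22, (va >> 12) & 0x3FF, va & 0xFFF };
    return r;
}

/* PAE paging: 2-bit PDPT index, 9-bit directory index, 9-bit table index, 12-bit offset. */
typedef struct { uint32_t pdpt, pd, pt, offset; } pae_va_t;

pae_va_t split_pae(uint32_t va)
{
    pae_va_t r = { va >> 30, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF, va & 0xFFF };
    return r;
}

/* IA-32e (4-level) paging: four 9-bit indexes and a 12-bit offset. */
typedef struct { uint32_t pml4, pdpt, pd, pt, offset; } ia32e_va_t;

ia32e_va_t split_ia32e(uint64_t va)
{
    ia32e_va_t r = {
        (uint32_t)((va >> 39) & 0x1FF), (uint32_t)((va >> 30) & 0x1FF),
        (uint32_t)((va >> 21) & 0x1FF), (uint32_t)((va >> 12) & 0x1FF),
        (uint32_t)(va & 0xFFF)
    };
    return r;
}
```

For instance, the address 0xC0101234 uses directory entry 0x300, table entry 0x101, and offset 0x234 under legacy paging.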


• The WP and PG flags in control register CR0 (bit 16 and bit 31, respectively).

• The PSE, PAE, PGE, PCIDE, SMEP, SMAP, and PKE flags in control register CR4 (bit 4, bit 5, bit 7, bit 17, bit 20, bit 21, and bit 22, respectively).

• The LME and NXE flags in the IA32_EFER MSR (bit 8 and bit 11, respectively).

• The AC flag in the EFLAGS register (bit 18).

From Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1; Chapter 4

If CR4.PAE and/or IA32_EFER.LME is set to 1, then PSE is completely disregarded.
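How these flags select a paging mode can be sketched as a small decision function. It takes the register values as plain integers; the enum and function names are hypothetical, and the combination LME = 1 with PAE = 0 (which the CPU does not allow with paging on) is simply reported as 32-bit paging here.

```c
#include <stdint.h>

#define CR0_PG   (1u << 31)  /* paging enabled */
#define CR4_PSE  (1u << 4)   /* page size extensions (4MB pages) */
#define CR4_PAE  (1u << 5)   /* physical address extension */
#define EFER_LME (1ull << 8) /* long mode enable */

typedef enum { PAGING_OFF, PAGING_32BIT, PAGING_PAE, PAGING_IA32E } paging_mode_t;

/* Work out the paging mode from the CR0, CR4, and IA32_EFER values. */
paging_mode_t paging_mode(uint32_t cr0, uint32_t cr4, uint64_t efer)
{
    if (!(cr0 & CR0_PG))
        return PAGING_OFF;
    if (!(cr4 & CR4_PAE))
        return PAGING_32BIT;
    return (efer & EFER_LME) ? PAGING_IA32E : PAGING_PAE;
}

/* CR4.PSE only has an effect in 32-bit paging; the other modes ignore it. */
int pse_is_used(uint32_t cr0, uint32_t cr4, uint64_t efer)
{
    return paging_mode(cr0, cr4, efer) == PAGING_32BIT && (cr4 & CR4_PSE) != 0;
}
```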

Commands, if you want to use it:

No terminal added

cd smkrnl; make run

Credit - Special Thanks to the OGs:

for the spark of inspiration/support on my continuous effort on this project and for helping me understand certain concepts within kernel/OS development. =)


  • Honorable fam mentions:
    LLE members
    Red Vice members such as Chc4 and Internal
