more docs
										274
									
								
								docs/plan/phase0.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,274 @@ | ||||
| # **Phase 0: Project Setup & Core Types** | ||||
|  | ||||
| *   **Goal**: Initialize the project structure, establish version control and build tooling, define the core data structures (primarily through Protocol Buffers as specified in the RFC), and ensure basic parsing/validation capabilities for initial configuration files. | ||||
| *   **RFC Sections Primarily Used**: Overall project understanding, Section 8.2 (Resource Representation Proto3 & JSON), Section 3 (Resource Model - for identifying initial protos), Section 3.9 (Cluster Configuration - for `cluster.kat`). | ||||
|  | ||||
| **Tasks & Sub-Tasks:** | ||||
|  | ||||
| 1.  **Initialize Git Repository & Go Module** | ||||
|     *   **Purpose**: Establish version control and Go project identity. | ||||
|     *   **Details**: | ||||
|         *   Create the root project directory (e.g., `kat-system`). | ||||
|         *   Navigate into the directory: `cd kat-system`. | ||||
|         *   Initialize Git: `git init`. | ||||
|         *   Create an initial `.gitignore` file. Add common Go and OS-specific ignores (e.g., `*.o`, `*.exe`, `*~`, `.DS_Store`, compiled binaries like `kat-agent`, `katcall`). | ||||
|         *   Initialize Go module: `go mod init github.com/dws-llc/kat-system` (or your chosen module path). | ||||
|     *   **Verification**: | ||||
|         *   `.git` directory exists. | ||||
|         *   `go.mod` file is created with the correct module path. | ||||
|         *   Initial commit can be made. | ||||
|  | ||||
| 2.  **Create Initial Directory Structure** | ||||
|     *   **Purpose**: Lay out the skeleton of the project for organizing code and artifacts. | ||||
|     *   **Details**: Create the top-level directories from the proposed directory/file structure: | ||||
|         ``` | ||||
|         kat-system/ | ||||
|         ├── api/ | ||||
|         │   └── v1alpha1/ | ||||
|         ├── cmd/ | ||||
|         │   ├── kat-agent/ | ||||
|         │   └── katcall/ | ||||
|         ├── docs/ | ||||
|         │   └── rfc/ | ||||
|         ├── examples/ | ||||
|         ├── internal/ | ||||
|         ├── pkg/      # (Optional, if you decide to have externally importable library code not part of 'internal') | ||||
|         ├── scripts/ | ||||
|         └── test/ | ||||
|         ``` | ||||
|         *   Place `RFC001-KAT.md` into `docs/rfc/`. | ||||
|     *   **Verification**: Directory structure matches the plan. | ||||
|  | ||||
| 3.  **Define Initial Protocol Buffer Messages (`api/v1alpha1/kat.proto`)** | ||||
|     *   **Purpose**: Create the canonical definitions for KAT resources that will be used for API communication and internal state representation. | ||||
|     *   **Details**: | ||||
|         *   Create `api/v1alpha1/kat.proto`. | ||||
|         *   Define initial messages based on RFC Section 3 and Section 8.2. Focus on data structures, not RPC service definitions yet. | ||||
|         *   **Common Metadata**: | ||||
|             ```protobuf | ||||
|             message ObjectMeta { | ||||
|               string name = 1; | ||||
|               string namespace = 2; | ||||
|               string uid = 3; | ||||
|               int64 generation = 4; | ||||
|               string resource_version = 5; // e.g., etcd ModRevision | ||||
|               google.protobuf.Timestamp creation_timestamp = 6; | ||||
|               map<string, string> labels = 7; | ||||
|               map<string, string> annotations = 8; // For future use | ||||
|             } | ||||
|  | ||||
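|             // Only needed if google/protobuf/timestamp.proto is not imported; | ||||
|             // prefer google.protobuf.Timestamp, which ObjectMeta already uses. | ||||
|             message Timestamp { | ||||
|               int64 seconds = 1; | ||||
|               int32 nanos = 2; | ||||
|             } | ||||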
|             ``` | ||||
|         *   **`Workload` (RFC 3.2)**: | ||||
|             ```protobuf | ||||
|             enum WorkloadType { | ||||
|               WORKLOAD_TYPE_UNSPECIFIED = 0; | ||||
|               SERVICE = 1; | ||||
|               JOB = 2; | ||||
|               DAEMON_SERVICE = 3; | ||||
|             } | ||||
|  | ||||
|             // ... (GitSource, UpdateStrategy, RestartPolicy, Container, VolumeMount, ResourceRequests, GPUSpec, Volume definitions) | ||||
|  | ||||
|             message WorkloadSpec { | ||||
|               WorkloadType type = 1; | ||||
|               // Source source = 2; // Define GitSource, ImageSource, CacheImage | ||||
|               int32 replicas = 3; | ||||
|               // UpdateStrategy update_strategy = 4; | ||||
|               // RestartPolicy restart_policy = 5; | ||||
|               map<string, string> node_selector = 6; | ||||
|               // repeated Toleration tolerations = 7; | ||||
|               Container container = 8; // Define Container fully | ||||
|               repeated Volume volumes = 9; // Define Volume fully (SimpleClusterStorage, HostMount) | ||||
|               // ... other spec fields from workload.kat | ||||
|             } | ||||
|  | ||||
|             message Workload { | ||||
|               ObjectMeta metadata = 1; | ||||
|               WorkloadSpec spec = 2; | ||||
|               // WorkloadStatus status = 3; // Define later | ||||
|             } | ||||
|             ``` | ||||
|             *(Start with core fields and expand. For brevity, not all sub-messages are listed here, but they need to be defined based on `workload.kat` fields in RFC 3.2)* | ||||
|         *   **`VirtualLoadBalancer` (RFC 3.3)**: | ||||
|             ```protobuf | ||||
|             message VirtualLoadBalancerSpec { | ||||
|               // repeated Port ports = 1; | ||||
|               // HealthCheck health_check = 2; | ||||
|               // repeated IngressRule ingress = 3; | ||||
|             } | ||||
|  | ||||
|             message VirtualLoadBalancer { // This might be part of Workload or a separate resource | ||||
|               ObjectMeta metadata = 1; // Name likely matches Workload name | ||||
|               VirtualLoadBalancerSpec spec = 2; | ||||
|             } | ||||
|             ``` | ||||
|             *Consider whether this is embedded in `Workload.spec` or is a truly separate resource associated by name. The RFC shows it as a separate `*.kat` file, implying a separate resource.* | ||||
|         *   **`JobDefinition` (RFC 3.4)**: Similar structure, `JobDefinitionSpec` with fields like `schedule`, `completions`. | ||||
|         *   **`BuildDefinition` (RFC 3.5)**: Similar structure, `BuildDefinitionSpec` with fields like `buildContext`, `dockerfilePath`. | ||||
|         *   **`Namespace` (RFC 3.7)**: | ||||
|             ```protobuf | ||||
|             message NamespaceSpec { | ||||
|               // Potentially finalizers or other future spec fields | ||||
|             } | ||||
|  | ||||
|             message Namespace { | ||||
|               ObjectMeta metadata = 1; | ||||
|               NamespaceSpec spec = 2; | ||||
|               // NamespaceStatus status = 3; // Define later | ||||
|             } | ||||
|             ``` | ||||
|         *   **`Node` (Internal Representation - RFC 3.8)**: (This is for Leader's internal state, not a user-defined Quadlet) | ||||
|             ```protobuf | ||||
|             message NodeResources { | ||||
|               string cpu = 1; | ||||
|               string memory = 2; | ||||
|               // map<string, string> custom_resources = 3; // e.g., for GPUs | ||||
|             } | ||||
|  | ||||
|             message NodeStatusDetails { // For status reporting by agent | ||||
|               NodeResources capacity = 1; | ||||
|               NodeResources allocatable = 2; | ||||
|               // repeated WorkloadInstanceStatus workload_instances = 3; | ||||
|               // OverlayNetworkStatus overlay_network = 4; | ||||
|               string condition = 5; // e.g., "Ready", "NotReady" | ||||
|               google.protobuf.Timestamp last_heartbeat_time = 6; | ||||
|             } | ||||
|  | ||||
|             message NodeSpec { // Configuration for a node, some set by leader | ||||
|               // repeated Taint taints = 1; | ||||
|               string overlay_subnet = 2; // Assigned by leader | ||||
|             } | ||||
|  | ||||
|             message Node { // Represents a node in the cluster | ||||
|               ObjectMeta metadata = 1; // Name is the unique node name | ||||
|               NodeSpec spec = 2; | ||||
|               NodeStatusDetails status = 3; | ||||
|             } | ||||
|             ``` | ||||
|         *   **`ClusterConfiguration` (RFC 3.9)**: | ||||
|             ```protobuf | ||||
|             message ClusterConfigurationSpec { | ||||
|               string cluster_cidr = 1; | ||||
|               string service_cidr = 2; | ||||
|               int32 node_subnet_bits = 3; | ||||
|               string cluster_domain = 4; | ||||
|               int32 agent_port = 5; | ||||
|               int32 api_port = 6; | ||||
|               int32 etcd_peer_port = 7; | ||||
|               int32 etcd_client_port = 8; | ||||
|               string volume_base_path = 9; | ||||
|               string backup_path = 10; | ||||
|               int32 backup_interval_minutes = 11; | ||||
|               int32 agent_tick_seconds = 12; | ||||
|               int32 node_loss_timeout_seconds = 13; | ||||
|             } | ||||
|  | ||||
|             message ClusterConfiguration { | ||||
|               ObjectMeta metadata = 1; // e.g., name of the cluster | ||||
|               ClusterConfigurationSpec spec = 2; | ||||
|             } | ||||
|             ``` | ||||
|         *   Include `syntax = "proto3";` and appropriate `package` and `option go_package` statements. | ||||
|         *   Import `google/protobuf/timestamp.proto` if used. | ||||
|     *   **Potential Challenges**: Accurately translating all nested YAML structures from Quadlet definitions into Protobuf messages. Deciding on naming conventions. | ||||
|     *   **Verification**: `kat.proto` file is syntactically correct. It includes initial definitions for the key resources. | ||||
|  | ||||
| 4.  **Set Up Protobuf Code Generation (`scripts/gen-proto.sh`, Makefile target)** | ||||
|     *   **Purpose**: Automate the conversion of `.proto` definitions into Go code. | ||||
|     *   **Details**: | ||||
|         *   Install `protoc` (the protobuf compiler) and the `protoc-gen-go` plugin. Record the plugin in `go.mod` with `go get google.golang.org/protobuf/cmd/protoc-gen-go`, then install the binary with `go install google.golang.org/protobuf/cmd/protoc-gen-go`. | ||||
|         *   Create `scripts/gen-proto.sh`: | ||||
|             ```bash | ||||
|             #!/bin/bash | ||||
|             set -e | ||||
|  | ||||
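|             # GOBIN is empty unless set explicitly; fall back to GOPATH/bin. | ||||
|             GOBIN_DIR="$(go env GOBIN)" | ||||
|             if [ -z "$GOBIN_DIR" ]; then | ||||
|                 GOBIN_DIR="$(go env GOPATH)/bin" | ||||
|             fi | ||||
|             PROTOC_GEN_GO="${GOBIN_DIR}/protoc-gen-go" | ||||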
|             if [ ! -f "$PROTOC_GEN_GO" ]; then | ||||
|                 echo "protoc-gen-go not found. Please run: go install google.golang.org/protobuf/cmd/protoc-gen-go" | ||||
|                 exit 1 | ||||
|             fi | ||||
|  | ||||
|             API_DIR="./api/v1alpha1" | ||||
|             OUT_DIR="${API_DIR}/generated" # Or directly into api/v1alpha1 if preferred | ||||
|  | ||||
|             mkdir -p "$OUT_DIR" | ||||
|  | ||||
|             protoc --proto_path="${API_DIR}" \ | ||||
|                    --go_out="${OUT_DIR}" --go_opt=paths=source_relative \ | ||||
|                    "${API_DIR}/kat.proto" | ||||
|  | ||||
|             echo "Protobuf Go code generated in ${OUT_DIR}" | ||||
|             ``` | ||||
|             *(Adjust paths and options as needed. `paths=source_relative` is common.)* | ||||
|         *   Make the script executable: `chmod +x scripts/gen-proto.sh`. | ||||
|         *   (Optional) Add a Makefile target: | ||||
|             ```makefile | ||||
|             .PHONY: generate | ||||
|             generate: | ||||
|             	@echo "Generating Go code from Protobuf definitions..." | ||||
|             	@./scripts/gen-proto.sh | ||||
|             ``` | ||||
|     *   **Verification**: | ||||
|         *   Running `scripts/gen-proto.sh` (or `make generate`) executes without errors. | ||||
|         *   Go files (e.g., `kat.pb.go`) are generated in the specified output directory (`api/v1alpha1/generated/` or `api/v1alpha1/`). | ||||
|         *   These generated files compile if included in a Go program. | ||||
|  | ||||
| 5.  **Implement Basic Parsing and Validation for `cluster.kat` (`internal/config/parse.go`, `internal/config/types.go`)** | ||||
|     *   **Purpose**: Enable `kat-agent init` to read and understand its initial cluster-wide configuration. | ||||
|     *   **Details**: | ||||
|         *   In `internal/config/types.go` (or use generated proto types directly if preferred for consistency): Define Go structs that mirror `ClusterConfiguration` from `kat.proto`. | ||||
|             *   If using proto types: the generated `ClusterConfiguration` struct can be used directly. | ||||
|         *   In `internal/config/parse.go`: | ||||
|             *   `ParseClusterConfiguration(filePath string) (*ClusterConfiguration, error)`: | ||||
|                 1.  Read the file content. | ||||
|                 2.  Unmarshal YAML into the Go struct (e.g., using `gopkg.in/yaml.v3`). | ||||
|                 3.  Perform basic validation: | ||||
|                     *   Check for required fields (e.g., `clusterCIDR`, `serviceCIDR`, ports). | ||||
|                     *   Validate CIDR formats. | ||||
|                     *   Ensure ports are within valid range. | ||||
|                     *   Ensure intervals are positive. | ||||
|             *   `SetClusterConfigDefaults(config *ClusterConfiguration)`: Apply default values as per RFC 3.9 if fields are not set. | ||||
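|     *   **Potential Challenges**: Handling YAML unmarshalling intricacies (generated proto structs carry `protobuf` and `json` struct tags but no `yaml` tags, so `gopkg.in/yaml.v3` will not map camelCase keys such as `clusterCIDR` onto them without an intermediate step), comprehensive validation logic. | ||||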
|     *   **Verification**: | ||||
|         *   Unit tests for `ParseClusterConfiguration`: | ||||
|             *   Test with a valid `examples/cluster.kat` file. Parsed struct should match expected values. | ||||
|             *   Test with missing required fields; expect an error. | ||||
|             *   Test with invalid field values (e.g., bad CIDR, invalid port); expect an error. | ||||
|             *   Test with a file that includes some fields and omits optional ones; verify defaults are applied by `SetClusterConfigDefaults`. | ||||
|         *   An example `examples/cluster.kat` file should be created for testing. | ||||
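
The validation and defaulting steps above might look roughly like the following sketch. It assumes a hand-written `ClusterConfig` struct covering only a few of the `ClusterConfigurationSpec` fields, a separate hypothetical `ValidateClusterConfig` helper, and placeholder default values; the real defaults come from RFC 3.9 and are not reproduced here. YAML unmarshalling is left out so the example depends only on the standard library.

```go
package main

import (
	"fmt"
	"net"
)

// ClusterConfig is a hypothetical hand-written mirror of a few
// ClusterConfigurationSpec fields; the generated proto struct could be used instead.
type ClusterConfig struct {
	ClusterCIDR           string
	ServiceCIDR           string
	AgentPort             int32
	APIPort               int32
	BackupIntervalMinutes int32
}

// Placeholder defaults for illustration only; substitute the values from RFC 3.9.
const (
	placeholderAgentPort      = 9001
	placeholderAPIPort        = 9002
	placeholderBackupInterval = 30
)

// SetClusterConfigDefaults fills in fields that were left at their zero value.
func SetClusterConfigDefaults(c *ClusterConfig) {
	if c.AgentPort == 0 {
		c.AgentPort = placeholderAgentPort
	}
	if c.APIPort == 0 {
		c.APIPort = placeholderAPIPort
	}
	if c.BackupIntervalMinutes == 0 {
		c.BackupIntervalMinutes = placeholderBackupInterval
	}
}

// ValidateClusterConfig checks required fields, CIDR syntax, port ranges and
// interval sign, returning the first problem it finds.
func ValidateClusterConfig(c *ClusterConfig) error {
	cidrs := []struct{ name, value string }{
		{"clusterCIDR", c.ClusterCIDR},
		{"serviceCIDR", c.ServiceCIDR},
	}
	for _, f := range cidrs {
		if f.value == "" {
			return fmt.Errorf("%s is required", f.name)
		}
		if _, _, err := net.ParseCIDR(f.value); err != nil {
			return fmt.Errorf("%s: %w", f.name, err)
		}
	}
	ports := []struct {
		name  string
		value int32
	}{
		{"agentPort", c.AgentPort},
		{"apiPort", c.APIPort},
	}
	for _, p := range ports {
		if p.value < 1 || p.value > 65535 {
			return fmt.Errorf("%s: port %d is outside 1-65535", p.name, p.value)
		}
	}
	if c.BackupIntervalMinutes <= 0 {
		return fmt.Errorf("backupIntervalMinutes must be positive, got %d", c.BackupIntervalMinutes)
	}
	return nil
}

func main() {
	cfg := &ClusterConfig{ClusterCIDR: "10.100.0.0/16", ServiceCIDR: "10.200.0.0/16"}
	SetClusterConfigDefaults(cfg)
	fmt.Println("valid config:", ValidateClusterConfig(cfg))

	cfg.ClusterCIDR = "10.100.0.0/33" // prefix length too long for IPv4
	fmt.Println("bad CIDR:", ValidateClusterConfig(cfg))
}
```

Applying defaults before validation lets the port and interval checks treat a zero value as an error rather than as "unset"; whether the project prefers that ordering is a design choice this sketch does not settle.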
|  | ||||
| 6.  **Implement Basic Parsing/Validation for Quadlet Files (`internal/config/parse.go`, `internal/utils/tar.go`)** | ||||
|     *   **Purpose**: Enable the Leader to understand submitted Workload definitions. | ||||
|     *   **Details**: | ||||
|         *   In `internal/utils/tar.go`: | ||||
|             *   `UntarQuadlets(reader io.Reader) (map[string][]byte, error)`: Takes a `tar.gz` stream, unpacks it in memory (or temp dir), and returns a map of `fileName -> fileContent`. | ||||
|         *   In `internal/config/parse.go`: | ||||
|             *   `ParseQuadletFile(fileName string, content []byte) (interface{}, error)`: | ||||
|                 1.  Unmarshal YAML content based on `kind` field (e.g., into `Workload`, `VirtualLoadBalancer` generated proto structs). | ||||
|                 2.  Perform basic validation on the specific Quadlet type (e.g., `Workload` must have `metadata.name`, `spec.type`). | ||||
|             *   `ParseQuadletDirectory(files map[string][]byte) (*Workload, *VirtualLoadBalancer, ..., error)`: | ||||
|                 1.  Iterate through files from `UntarQuadlets`. | ||||
|                 2.  Use `ParseQuadletFile` for each. | ||||
|                 3.  Perform cross-Quadlet file validation (e.g., if `build.kat` exists, `workload.kat` must have `spec.source.git`). Placeholder for now, more in later phases. | ||||
|     *   **Potential Challenges**: Handling different Quadlet `kind`s, managing inter-file dependencies. | ||||
|     *   **Verification**: | ||||
|         *   Unit tests for `UntarQuadlets` with a sample `tar.gz` archive containing example Quadlet files. | ||||
|         *   Unit tests for `ParseQuadletFile` for each Quadlet type (`workload.kat`, `VirtualLoadBalancer.kat` etc.) with valid and invalid content. | ||||
|         *   An example Quadlet directory (e.g., `examples/simple-service/`) should be created and tarred for testing. | ||||
|         *   `ParseQuadletDirectory` successfully parses a valid collection of Quadlet files from the tar. | ||||
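
As a rough illustration of the `UntarQuadlets` step, the sketch below unpacks a `tar.gz` stream in memory using only the standard library. The per-file size cap, the choice to key the map by base file name, the duplicate-name check, and the `buildArchive` helper (used only to produce sample input) are assumptions made for this example, not requirements taken from the RFC.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"path"
)

// maxQuadletFileSize is a hypothetical per-file limit guarding against oversized entries.
const maxQuadletFileSize = 1 << 20 // 1 MiB

// UntarQuadlets reads a gzip-compressed tar stream and returns fileName -> content
// for every regular file in it. Directories and other entry types are skipped.
func UntarQuadlets(r io.Reader) (map[string][]byte, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, fmt.Errorf("open gzip stream: %w", err)
	}
	defer gz.Close()

	files := make(map[string][]byte)
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return files, nil // end of archive
		}
		if err != nil {
			return nil, fmt.Errorf("read tar entry: %w", err)
		}
		if hdr.Typeflag != tar.TypeReg {
			continue
		}
		if hdr.Size > maxQuadletFileSize {
			return nil, fmt.Errorf("%s: larger than %d bytes", hdr.Name, maxQuadletFileSize)
		}
		name := path.Base(hdr.Name) // assumption: Quadlet files are addressed by base name
		if _, dup := files[name]; dup {
			return nil, fmt.Errorf("%s: duplicate file name in archive", name)
		}
		content, err := io.ReadAll(tr)
		if err != nil {
			return nil, fmt.Errorf("read %s: %w", hdr.Name, err)
		}
		files[name] = content
	}
}

// buildArchive is a helper for this example that packs files into an in-memory tar.gz.
func buildArchive(files map[string]string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	for name, body := range files {
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(body))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	archive, err := buildArchive(map[string]string{
		"simple-service/workload.kat": "kind: Workload\n",
	})
	if err != nil {
		panic(err)
	}
	files, err := UntarQuadlets(archive)
	if err != nil {
		panic(err)
	}
	for name, content := range files {
		fmt.Printf("%s: %d bytes\n", name, len(content))
	}
}
```

The same `buildArchive` pattern can back the unit tests listed above, so the test suite does not need a checked-in binary archive.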
|  | ||||
| *   **Milestone Verification (Overall Phase 0)**: | ||||
|     1.  Project repository is set up with Go modules and initial directory structure. | ||||
|     2.  `make generate` (or `scripts/gen-proto.sh`) successfully compiles `api/v1alpha1/kat.proto` into Go source files without errors. The generated Go code includes structs for `Workload`, `VirtualLoadBalancer`, `JobDefinition`, `BuildDefinition`, `Namespace`, internal `Node`, and `ClusterConfiguration`. | ||||
|     3.  Unit tests in `internal/config/parse_test.go` demonstrate: | ||||
|         *   Successful parsing of a valid `cluster.kat` file into the `ClusterConfiguration` struct, including application of default values. | ||||
|         *   Error handling for invalid or incomplete `cluster.kat` files. | ||||
|     4.  Unit tests in `internal/config/parse_test.go` (and potentially `internal/utils/tar_test.go`) demonstrate: | ||||
|         *   Successful untarring of a sample `tar.gz` Quadlet archive. | ||||
|         *   Successful parsing of individual Quadlet files (e.g., `workload.kat`, `VirtualLoadBalancer.kat`) into their respective Go structs (using generated proto types). | ||||
|         *   Basic validation of required fields within individual Quadlet files. | ||||
|     5.  All code is committed to Git. | ||||
|     6.  (Optional but good practice) A basic `README.md` is started. | ||||