A Fast and Correct Gather Algorithm

This document formally describes an optimal algorithm to solve the Maximum Weight Bounded-Size Subtree problem. By leveraging the properties of the Depth-First Search (DFS) sequence (preorder traversal), the algorithm flattens the tree into a 1D array. This transformation reduces the traditional complex tree-based backpack merging process into a standard sequence Dynamic Programming (DP) algorithm, achieving a deterministic time complexity.

Definition 1 (Maximum Weight Bounded-Size Subtree).

Let be a rooted tree with a designated root node . Each node is assigned a real-valued weight . Given a capacity integer (), the objective is to find a subset of vertices such that:

  • .
  • The subgraph induced by is connected in .
  • .

The goal is to maximize the total weight of the selected subset:

Note that the connectivity constraint, combined with the requirement that , implies a strict topological dependency: for any node , its parent must also be in .

The algorithm transforms the tree structure into a one-dimensional array using a Preorder Traversal (DFS sequence). Let .

Let be the sequence of vertices in the order they are visited during a DFS preorder traversal, where . Let denote the total number of nodes in the subtree rooted at .

We define a 2D dynamic programming table . Let be the maximum weight obtained by selecting exactly nodes from the suffix of the DFS sequence , assuming that the dependency path from the root down to the parent of has already been selected.

For a given index and size , we have two choices regarding the current node :

  • Include : We gain its weight , consume 1 unit of capacity, and proceed to evaluate the very next node in the DFS sequence . Thus, the maximal weight is .
  • Exclude : If we do not select , the connectivity dependency dictates that we cannot select any node in the subtree of . Since the subtree of occupies exactly the contiguous block from index to in the preorder sequence, we must skip this entire block. The next available valid node is precisely at index . Thus, the maximal weight is .

Combining these, the optimal sub-structure is given by the Bellman equation:

By definition, the root must be included. Therefore, we do not have the option to exclude it:

For any index , choosing nodes yields a weight of : . For any conceptually invalid state, the table is initialized to .

Theorem 2 (Correctness). The D tables correctly compute the maximum weight subtree satisfying the connectivity and capacity constraints.

Proof. By the properties of DFS preorder traversal, a parent node always appears before any of its descendants. Furthermore, the descendants of form a contiguous subarray in the sequence.

The connectivity constraint dictates that if a node is not selected, no node in its subtree can be selected. The transition explicitly models this: exclusion of node forces the state transition to , completely bypassing all descendants of . Inclusion of evaluates the contiguous node (allowing the selection of its children). Since the root is forcibly included, the subgraph generated by this sequence of decisions is guaranteed to form a single connected component containing the root. By Bellman’s Principle of Optimality, evaluating both choices ensures global topological optimization. ⁠ 

Theorem 3 (Complexity). The proposed algorithm has a time complexity of and a space complexity of .

Proof. The DFS traversal to generate arrays and takes time. The DP table has rows and columns. Each cell is computed in time by comparing at most two previously computed values. Therefore, filling the table takes operations. Finding the maximum value among () takes . Overall time complexity is .

The algorithm strictly maintains an array of size and auxiliary arrays of size . Thus, the memory footprint is bounded by . ⁠