Problem formulation
We are given a set of protected base models and a deployed suspect API that may have been fine-tuned from one of them. The defender has no access to weights, activations, or training logs; it can only submit text queries and observe the generated images. The goal is to assign a posterior over candidate lineages and to make an attribution decision with controlled confidence. For each prompt p and model m, CSF samples multiple generations, maps each generated image to a semantic label c, and estimates the prompt-conditioned category distribution, that is, the distribution over labels c for model m given prompt p.
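The estimation step described above can be sketched as a simple frequency count. This is a minimal illustration, not the CSF implementation: the function name `estimate_category_distribution`, the fixed `categories` list, and the optional additive smoothing parameter `alpha` are all assumptions introduced here. The labels are assumed to have already been produced by some image-to-label mapping, which is not shown.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence


def estimate_category_distribution(
    labels: Iterable[Hashable],
    categories: Sequence[Hashable],
    alpha: float = 0.0,
) -> Dict[Hashable, float]:
    """Estimate the category distribution for one (prompt, model) pair.

    labels:     semantic labels c, one per sampled generation (hypothetical input).
    categories: the full label set, so unseen categories get an explicit entry.
    alpha:      additive smoothing constant (an assumption; 0.0 gives raw frequencies).
    """
    counts = Counter(labels)

    # Reject labels outside the declared category set instead of silently dropping them.
    unknown = set(counts) - set(categories)
    if unknown:
        raise ValueError(f"labels outside the category set: {sorted(map(str, unknown))}")

    total = sum(counts.values()) + alpha * len(categories)
    if total <= 0:
        raise ValueError("need at least one label or alpha > 0")

    # Counter returns 0 for missing keys, so unseen categories get alpha / total.
    return {c: (counts[c] + alpha) / total for c in categories}


# Example: four generations from one model for one prompt (made-up labels).
sampled_labels = ["cat", "cat", "dog", "cat"]
label_set = ["cat", "dog", "bird"]
dist = estimate_category_distribution(sampled_labels, label_set)
```

With `alpha=0.0` the estimate is the raw relative frequency, so categories that never appear receive probability zero. A small positive `alpha` avoids zeros, which matters if the distributions are later compared with a divergence that is undefined on zero-probability categories; whether CSF does this is not stated in the source.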



