The "self-model" you're talking about is the agent in the Reinforcement Learning framework. It moves between states in an environment and learns from reward it earns from each action.
Yes, although I wonder if, or how well, these self-models develop in practice compared to world-models. Say a 2D agent has a rectangular body shape: it probably won't develop a high-level representation of that fact unless its actions allow it to perceive it accurately. Figuring that out purely from collisions produced by basic actions (rotate left, rotate right, move forward, etc.) seems practically infeasible. It has neither sight (to observe its own movements) nor self-touch (which would let it trace its boundaries and relate them to what it has seen).
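To make that information bottleneck concrete, here's a toy sketch (my own construction, with assumed geometry and action set): a rectangular agent in a square arena whose entire per-step observation is a single collision bit. Recovering "my body is a 2×1 rectangle" from sequences of such bits is exactly the inference problem you're describing:

```python
import math

class RectAgent:
    """Rectangular 2D agent whose only sense is a binary collision flag."""

    def __init__(self, w=2.0, h=1.0, arena=10.0):
        self.w, self.h, self.arena = w, h, arena
        self.x = self.y = arena / 2  # start in the middle of the arena
        self.theta = 0.0

    def _corners(self, x, y, theta):
        # The rectangle's four corners in world coordinates.
        c, s = math.cos(theta), math.sin(theta)
        return [(x + c * dx - s * dy, y + s * dx + c * dy)
                for dx in (-self.w / 2, self.w / 2)
                for dy in (-self.h / 2, self.h / 2)]

    def step(self, action):
        """Actions: 0 = move forward, 1 = rotate left, 2 = rotate right.
        Returns only a collision bit -- the agent's entire observation."""
        x, y, theta = self.x, self.y, self.theta
        if action == 0:
            x += math.cos(theta)
            y += math.sin(theta)
        elif action == 1:
            theta += 0.2
        else:
            theta -= 0.2
        # Collision iff any corner would leave the square arena.
        collided = any(not (0 <= cx <= self.arena and 0 <= cy <= self.arena)
                       for cx, cy in self._corners(x, y, theta))
        if not collided:  # the move only takes effect when it's legal
            self.x, self.y, self.theta = x, y, theta
        return int(collided)
```

Note how much the simulator knows (pose, corner positions, arena size) versus how little crosses the observation channel: one bit per action, and only near walls. That asymmetry is why a self-model of body shape seems so much harder to learn here than a world-model.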