Edge Dance

Create dance routines from music.

Tags: music, choreography

Tool Information

Primary Task: Music choreography
Category: media-and-content-creation
Sub Categories: music-composition, animation

EDGE (Editable Dance Generation from Music) is an AI tool that generates high-quality choreographies from music using music embeddings from the Jukebox model. It encodes the input music into embeddings with a frozen Jukebox model, then uses a conditional diffusion model to map those embeddings to a series of 5-second dance clips. At inference time, temporal constraints are applied to batches of multiple clips to enforce temporal consistency before the clips are stitched into an arbitrary-length full video. The tool supports arbitrary spatial and temporal constraints, making it suitable for a range of end-user applications, including dances subject to joint-wise constraints, motion in-betweening, and dance continuation. EDGE also introduces a Contact Consistency Loss that improves physical realism: it keeps intentional foot-ground sliding intact while avoiding unintentional foot sliding, so the generated dances remain physically plausible. The tool was trained with physical realism in mind and has been shown to outperform previous work, with human raters strongly preferring dances generated by EDGE. Overall, EDGE is a powerful AI tool for generating high-quality choreographies from music, with potential applications across industries including entertainment and the arts.
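
For intuition, the following is a minimal Python sketch of the pipeline's overall shape. Every component is a toy stand-in (random tensors replace the real Jukebox encoder and diffusion sampler), and the frame rate and pose dimensions are assumed values, not EDGE's actual code.

```python
# Toy sketch of the EDGE pipeline shape: encode music, generate one
# 5-second clip per window, stitch clips into a full-length dance.
# All components are stand-ins, not the real EDGE or Jukebox code.
import torch

CLIP_FRAMES = 150                 # assumed: 5 s of motion at 30 fps
EMBED_DIM, POSE_DIM = 64, 72      # illustrative feature/pose sizes

def encode_music(waveform: torch.Tensor) -> torch.Tensor:
    """Stand-in for the frozen Jukebox encoder: waveform -> frame features."""
    n_frames = max(1, waveform.numel() // 1600)   # toy frame rate
    return torch.randn(n_frames, EMBED_DIM)

def sample_clip(window: torch.Tensor) -> torch.Tensor:
    """Stand-in for one conditional diffusion sampling call."""
    return torch.randn(window.shape[0], POSE_DIM)

def generate_dance(waveform: torch.Tensor) -> torch.Tensor:
    features = encode_music(waveform)
    # One diffusion call per 5-second window of the music features;
    # EDGE additionally constrains the batch for temporal consistency.
    clips = [sample_clip(w) for w in features.split(CLIP_FRAMES, dim=0)]
    return torch.cat(clips, dim=0)    # stitched, arbitrary-length dance

dance = generate_dance(torch.randn(48_000 * 10))  # 10 s of fake audio
print(dance.shape)                                # (frames, POSE_DIM)
```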

Pros
  • Generates high-quality choreographies
  • Uses music embeddings
  • Frozen Jukebox model encoding
  • Uses conditional diffusion model
  • Enforces temporal consistency
  • Handles multiple 5-second clips
  • Supports arbitrary-length full videos
  • Arbitrary spatial and temporal constraints
  • Suitable for joint-wise constraints
  • Useful for motion in-betweening
  • Supports dance continuation
  • New Contact Consistency Loss
  • Improves physical realism
  • Avoids unintentional foot sliding
  • Physically plausible dance generation
  • Outperforms previous work
  • Highly rated by human raters
  • Applications in entertainment industry
  • Applications in arts industry
  • Creates routines from unseen music
  • Generates lower body from upper body
  • Generates upper body from lower body
  • Start and end with prespecified motions
  • Can be conditioned on specific motions
  • Embeddings from Jukebox model
  • Generates dances of any length

Cons
  • Requires trained Jukebox model
  • Generates in fixed 5-second clips
  • Temporal consistency enforcement needed
  • Heavy reliance on constraints
  • Potential physical realism issues
  • Possible motion discontinuities at clip boundaries
  • Difficulty handling complex sequences
  • Demanding computational resources
  • Possible foot sliding inaccuracies
  • Limited end-user applications

Frequently Asked Questions

1. What is EDGE: Editable Dance Generation from Music?

EDGE: Editable Dance Generation from Music is an AI tool that generates high-quality choreographies from music. It uses music embeddings from the Jukebox model and a conditional diffusion model to map these music embeddings to a series of 5-second dance clips.

2. How does EDGE generate choreographies from music?

EDGE generates choreographies from music by encoding the input music into embeddings using a frozen Jukebox model. Then, a conditional diffusion model is used to map these music embeddings to a series of 5-second dance clips. Temporal constraints are applied to batches of multiple clips for temporal consistency before they are stitched into an arbitrary-length full video.

3. What is the Jukebox model used for in EDGE?

In EDGE, the Jukebox model encodes the input music into embeddings. Its broad understanding of music lets EDGE create high-quality dances even for in-the-wild music samples.
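
To show what a frozen encoder means in practice, here is a minimal sketch using a throwaway convolutional network as a stand-in for Jukebox; the real Jukebox architecture and interface are different.

```python
# Frozen-encoder pattern: extract music features without updating or
# backpropagating through the encoder. The encoder here is a toy
# stand-in for Jukebox, not the real network.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=16, stride=8),
    nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=16, stride=8),
)
encoder.requires_grad_(False)          # "frozen": weights are never trained
encoder.eval()

waveform = torch.randn(1, 1, 48_000)   # 1 s of fake mono audio at 48 kHz
with torch.no_grad():                  # no gradients through the encoder
    embedding = encoder(waveform)      # (batch, channels, frames)
print(embedding.shape)
```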

4. How does the conditional diffusion model in EDGE work?

The conditional diffusion model in EDGE is trained to map music embeddings to a series of 5-second dance clips. At inference time, once the input music has been encoded into embeddings by the frozen Jukebox model, the diffusion model denoises random noise into dance motion, conditioned on those embeddings.
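
As a concrete illustration, here is a generic DDPM-style conditional sampling loop; the network, noise schedule, and shapes are assumed stand-ins, not EDGE's actual sampler.

```python
# Generic DDPM-style conditional sampling: start from Gaussian noise
# and iteratively denoise, conditioning each step on the music embedding.
import torch

T = 50                                 # illustrative number of steps
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def predict_noise(x_t, t, cond):
    """Stand-in for the trained conditional denoising network."""
    return torch.zeros_like(x_t)

def sample(cond: torch.Tensor, shape) -> torch.Tensor:
    x = torch.randn(shape)                        # pure noise at step T
    for t in reversed(range(T)):
        eps = predict_noise(x, t, cond)           # conditioned on music
        # Standard DDPM posterior-mean update for x_{t-1}.
        x = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x

music_embedding = torch.randn(150, 64)            # one 5-second window
clip = sample(music_embedding, shape=(150, 72))   # one 5-second dance clip
```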

5. What are the temporal constraints applied by EDGE?

EDGE applies temporal constraints on batches of multiple clips during inference. These constraints enforce temporal consistency before the clips are stitched into an arbitrary-length full video. Temporal constraints are also used in EDGE to generate dances of any length by imposing temporal continuity between batches of multiple sequences.
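
One simple way to realize such a constraint, assuming consecutive clips share a half-clip overlap (a simplification of EDGE's batched procedure), is sketched below.

```python
# Toy sketch of temporal consistency across a batch of clips: the first
# half of each clip is forced to match the second half of the previous
# clip, so the clips stitch into one continuous dance.
import torch

def enforce_overlap(clips: torch.Tensor) -> torch.Tensor:
    """clips: (n_clips, frames, pose_dim), frames assumed even."""
    half = clips.shape[1] // 2
    for i in range(1, clips.shape[0]):
        clips[i, :half] = clips[i - 1, half:]   # pin the shared region
    return clips

def stitch(clips: torch.Tensor) -> torch.Tensor:
    half = clips.shape[1] // 2
    # Keep clip 0 whole, then only the new (second) half of each later clip.
    pieces = [clips[0]] + [c[half:] for c in clips[1:]]
    return torch.cat(pieces, dim=0)

batch = torch.randn(4, 150, 72)                  # four 5-second clips
dance = stitch(enforce_overlap(batch))
print(dance.shape)                               # (150 + 3 * 75, 72)
```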

6. What kind of constraints can EDGE support?

EDGE can support arbitrary spatial and temporal constraints. These can be used to support applications such as arbitrarily long dances by enforcing temporal continuity between batches of multiple sequences, dances subject to joint-wise constraints like lower body generation given upper body motion, or vice versa, In-Betweening motions, and dances that start with a prespecified motion.

7. What is the Contact Consistency Loss in EDGE?

The Contact Consistency Loss in EDGE is a new training objective that learns when feet should and shouldn't slide. It significantly improves physical realism while keeping intentional foot-ground sliding intact, avoiding unintentional foot sliding.
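
As a hedged sketch, a loss of this kind can penalize foot movement on frames where contact is predicted; the shapes and weighting below are illustrative, not EDGE's exact formulation.

```python
# Contact-consistency-style loss sketch: foot joints the model predicts
# to be in ground contact should not move between frames, while feet
# with contact ~ 0 may slide freely (intentional slides stay intact).
import torch

def contact_consistency_loss(foot_pos: torch.Tensor,
                             contact: torch.Tensor) -> torch.Tensor:
    """
    foot_pos: (frames, n_feet, 3) foot positions, e.g. from forward
              kinematics on the generated poses.
    contact:  (frames - 1, n_feet) predicted contact probabilities.
    """
    velocity = foot_pos[1:] - foot_pos[:-1]        # per-frame displacement
    sq_speed = velocity.pow(2).sum(dim=-1)         # (frames - 1, n_feet)
    return (sq_speed * contact).mean()             # penalize planted motion

loss = contact_consistency_loss(torch.randn(150, 2, 3), torch.rand(149, 2))
print(loss.item())
```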

8. How does EDGE ensure the physical realism of the generated dances?

EDGE ensures the physical realism of generated dances with its Contact Consistency Loss, which learns when feet should and shouldn't slide. This significantly improves physical realism while keeping intentional foot-ground sliding intact. The tool has been trained with physical realism in mind.

9. In what applications can EDGE be used?

EDGE can be used in various applications, including creating arbitrarily long dances, creating dances subject to joint-wise constraints, motion in-betweening, and dance continuation. These applications are useful in industries such as entertainment and the arts.

10. What are the joint-wise constraints in EDGE?

Joint-wise constraints in EDGE allow for lower body generation given the motion of the upper body, or vice versa. This can be used for creating dances that require specific movements of either the upper body or the lower body.
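
A common way to implement such constraints in diffusion models is inpainting-style masking, sketched below with hypothetical shapes; this mirrors the general technique rather than EDGE's exact code.

```python
# Joint-wise editing as inpainting-style masking: after every denoising
# step, joints covered by the mask are reset to the reference motion, so
# the diffusion model only fills in the unconstrained joints.
import torch

def constrained_step(x, reference, mask, step_fn):
    """
    x:         (frames, pose_dim) current noisy sample
    reference: (frames, pose_dim) prescribed motion (e.g. the upper body)
    mask:      (pose_dim,) bool, True where motion is prescribed
    step_fn:   one denoising update (a stand-in lambda below)
    """
    x = step_fn(x)
    return torch.where(mask, reference, x)   # re-impose known joints

x = torch.randn(150, 72)
upper_body = torch.randn(150, 72)
mask = torch.zeros(72, dtype=torch.bool)
mask[:36] = True                   # pretend dims 0-35 are the upper body
x = constrained_step(x, upper_body, mask, step_fn=lambda x: x * 0.9)
```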

11. Can EDGE generate dances for any length of music?

Yes, EDGE can generate dances for any length of music. It does this by imposing temporal constraints on batches of sequences, allowing it to generate dances of any length.

12. How does EDGE perform compared to previous works?

Human raters have shown a strong preference for dances generated by EDGE over those of previous work, indicating an improved performance by EDGE compared to previous methodologies.

13. What kind of dance clips does EDGE use for training?

EDGE is trained on 5-second dance clips paired with music. From these clips, the conditional diffusion model learns to map each music embedding to the corresponding dance motion.

14. How does EDGE handle foot sliding in dance motions?

EDGE handles foot sliding in dance motions with its Contact Consistency Loss, which learns when feet should and shouldn't slide. It avoids unintentional foot sliding, improving the physical realism of the dances while maintaining intentional foot-ground sliding.

15. What is motion in-betweening in the context of EDGE?

Motion in-betweening, in the context of EDGE, refers to the creation of dances that start and end with prespecified motions. It's one of the temporal constraints that can be implemented using EDGE.
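
In-betweening can be viewed as the same masking idea applied along the time axis instead of across joints; the sketch below uses assumed frame counts, not EDGE's exact procedure.

```python
# In-betweening as temporal masking: pin the first and last frames to
# prespecified motions and let the diffusion model generate the middle.
import torch

frames, pose_dim, pin = 150, 72, 15
time_mask = torch.zeros(frames, 1, dtype=torch.bool)
time_mask[:pin] = True             # prescribed opening motion
time_mask[-pin:] = True            # prescribed closing motion

x = torch.randn(frames, pose_dim)          # current diffusion sample
reference = torch.randn(frames, pose_dim)  # known start/end poses
x = torch.where(time_mask, reference, x)   # re-impose pinned frames each step
```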

16. From where can I get the source code for EDGE?

The source code for EDGE is available in the GitHub repository: https://github.com/Stanford-TML/EDGE

17. How does EDGE handle dance continuation?

EDGE handles dance continuation by allowing for dances that start with a prespecified motion. This is achieved through its support for temporal constraints which can be tailored to suit the requirements of dance continuation.

18. What is EDGE's approach towards generating upper body from lower body motion and vice versa?

EDGE generates upper-body motion from lower-body motion (and vice versa) through joint-wise constraints: the prescribed joints are held fixed while the model generates the remaining joints.

19. Can human raters determine the quality of dances generated by EDGE?

Yes. Human raters have shown a strong preference for dances generated by EDGE over those of previous work, confirming its ability to generate high-quality dance choreographies.

20. What type of music can EDGE create choreography for?

EDGE can create choreography for a broad range of music. It leverages music embeddings from the powerful Jukebox model, which allows it to understand and create dances even for in-the-wild music samples.
