APPARATUS AND METHOD FOR TRANSFORMING AUDIO CHARACTERISTICS OF AN AUDIO RECORDING

Bibliographic Details
Title: APPARATUS AND METHOD FOR TRANSFORMING AUDIO CHARACTERISTICS OF AN AUDIO RECORDING
Document Number: 20100235166
Publication Date: September 16, 2010
Appl. No: 12/375792
Application Filed: October 17, 2007
Abstract: A method of audio processing comprises composing one or more transformation profiles for transforming audio characteristics of an audio recording and then generating, for the or each transformation profile, a metadata set comprising transformation profile data and location data indicative of where in the recording the transformation profile data is to be applied; the or each metadata set is then stored in association with the corresponding recording. A corresponding method of audio reproduction comprises reading a recording and a metadata set associated with that recording from storage, applying transformations to the recording data in accordance with the transformation profile of the metadata set, and then outputting the transformed recording.
Inventors: Bardino, Daniele Giuseppe (London, GB); Griffiths, Richard James (London, GB)
Assignees: SONY COMPUTER ENTERTAINMENT EUROPE LIMITED (London, GB)
Claim: 1. A method of audio processing comprising the steps of: composing one or more transformation profiles for transforming audio characteristics of an audio recording; generating, for the or each transformation profile, a metadata set comprising respective transformation profile data and location data indicative of where in the recording the transformation profile data is to be applied; and storing the or each metadata set in association with the corresponding recording.
Claim: 2. A method of audio processing according to claim 1 in which a transformation profile comprises at least one sequence of predefined profile elements whose parameters are adjustable by a user.
Claim: 3. A method of audio processing according to claim 2 in which at least some of the predefined profile elements are selected from the list consisting of: i. uniform alteration of amplitude, pitch or duration; ii. ramp-up change in amplitude or pitch; iii. ramp-down change in amplitude or pitch; iv. peaked change in amplitude or pitch; v. point change in amplitude; and vi. non-linear alteration in duration.
Claim: 4. A method of audio processing according to claim 1 in which a transformation profile comprises at least one user-defined profile.
Claim: 5. A method of audio processing according to claim 1 further comprising a step prior to composing one or more transformation profiles of: identifying locations of speech syllables in the recording.
Claim: 6. A method of audio processing according to claim 5 in which the step of identifying locations of speech syllables in the recording is performed by a hidden Markov model.
Claim: 7. A method of audio processing according to claim 5 in which the step of identifying locations of speech syllables in the recording is performed by a comb filter operable to detect instances of voiced harmonics.
Claim: 8. A method of audio processing according to claim 5 comprising the step of selecting a predefined profile element for use in a transformation profile to be applied to a segment of a recording corresponding to an identified syllable.
Claim: 9. A method of audio processing according to claim 1 comprising the step of arranging recorded dialogue into lines.
Claim: 10. A method of audio processing according to claim 1 comprising the step of constraining the transformation profile to substantially maintain the relative formant structure of speech within the recording upon transformation.
Claim: 11. A method of audio processing according to claim 1 in which the metadata set further comprises at least a first tag indicative of an emotion conveyed by the recording when modified according to the transformation profile of the metadata set.
Claim: 12. A method of audio processing according to claim 10 where a tag indicates one or more selected from the list consisting of: i. an emotion state within a preset list of emotion states; and ii. a value on a scale indicating the positive or negative extent of an emotion state.
Claim: 13. A method of audio processing according to claim 1 comprising the steps of: reading from storage a recording and a meta-data set associated with said recording, in which the meta-data set comprises a transformation profile; applying transformations to the recording data in accordance with said transformation profile; and outputting the transformed recording.
Claim: 14. Audio processing apparatus, comprising: composition means; metadata set generation means; and storage writing means, the audio processing apparatus being operable to carry out the method of claim 1.
Claim: 15. A method of audio reproduction, comprising the steps of: reading from storage a recording and a meta-data set associated with said recording, in which the meta-data set comprises a transformation profile; applying transformations to the recording data in accordance with said transformation profile; and outputting the transformed recording.
Claim: 16. A method of audio reproduction according to claim 15 in which transformations are applied to one or more characteristics of the recording selected from the list consisting of: i. amplitude; ii. pitch; and iii. duration.
Claim: 17. A method of audio reproduction according to claim 15 in which the transformation profile comprises one or more profile elements, wherein at least some of the predefined profile elements are selected from the list consisting of: i. uniform alteration of amplitude, pitch or duration; ii. ramp-up change in amplitude or pitch; iii. ramp-down change in amplitude or pitch; iv. peaked change in amplitude or pitch; v. point change in amplitude; and vi. non-linear alteration in duration.
Claim: 18. A method of audio reproduction according to claim 15 in which a transformation profile comprises at least one user-defined profile.
Claim: 19. A method of audio reproduction according to claim 15 comprising the step of selecting one metadata set based upon a respective emotion tag of the metadata set from among a plurality of metadata sets associated with a recording.
Claim: 20. A method of audio reproduction according to claim 19 in which the emotion tag indicates a specific emotion conveyed by a recording when modified according to the transformation profile of the corresponding metadata set.
Claim: 21. A method of audio reproduction according to claim 19 in which the emotion tag is a value on an emotional scale indicative of degree of positive or negative emotion conveyed in a recording when modified according to the transformation profile of the corresponding metadata set.
Claim: 22. A method of audio reproduction according to claim 15 comprising the step of modifying lip synchronisation of a video game character according to transformation profile data relating to changes in duration when the dialogue being delivered by the video game character is also modified according to said transformation profile data.
Claim: 23. A method of audio reproduction according to claim 15 comprising the step of modifying the facial animation of a video game character according to transformation profile data relating to changes in any or all of amplitude and pitch when the dialogue being delivered by the video game character is also modified according to said transformation profile data.
Claim: 24. A method of audio reproduction according to claim 15 comprising the step of modifying the expression of a video game character according to an emotion tag of a selected metadata set when the dialogue being delivered by the video game character is also modified according to transformation profile data associated with the selected metadata set.
Claim: 25. A method of audio reproduction according to claim 15 comprising the step of altering one or more values of a transformation profile prior to applying transformations to the recording, according to the value of one or more parameters of a video-game outputting the recording.
Claim: 26. A method of audio reproduction according to claim 15 comprising the step of randomly altering one or more values of the transformation profile prior to applying transformations to the recording.
Claim: 27. A method of audio reproduction according to claim 26 in which any or all of i. the degree of random change; and ii. the number of random changes, is dependent upon the duration of game-play from the last re-load of a video-game that is outputting the recording.
Claim: 28. A method of audio reproduction according to claim 15 comprising the step of randomly composing a transformation profile from one or more of the available predefined profile elements.
Claim: 29. A method of audio reproduction according to claim 15 constraining any changes to a transformation profile to substantially maintain the relative formant structure of speech within the recording upon transformation.
Claim: 30. Audio reproduction apparatus, comprising: storage reading means; transformation processing means; and audio output means, the audio reproduction apparatus being operable to carry out the method of claim 13.
Claim: 31. A data carrier comprising computer readable instructions that, when executed by a computer, cause the computer to carry out the method of audio processing according to claim 1.
Claim: 32. A data carrier comprising an audio recording and at least a first metadata set associated with said audio recording, the metadata set being generated by the method of audio processing in accordance with claim 1.
Claim: 33. A data carrier comprising computer readable instructions that, when executed by a computer, cause the computer to carry out the method of audio reproduction according to claim 15.
Claim: 34. A data signal comprising computer readable instructions that, when executed by a computer, cause the computer to carry out the method of audio processing according to claim 1.
Claim: 35. A data signal comprising an audio recording and at least a first metadata set associated with said audio recording, the metadata set being generated by the method of audio processing in accordance with claim 1.
Claim: 36. A data signal comprising computer readable instructions that, when executed by a computer, cause the computer to carry out the method of audio reproduction according to claim 15.
Claim: 37. Audio processing apparatus comprising: a profile composer to compose one or more transformation profiles for transforming audio characteristics of an audio recording; a generator to generate, for the or each transformation profile, a metadata set comprising respective transformation profile data and location data indicative of where in the recording the transformation profile data is to be applied; and a metadata store to store the or each metadata set in association with the corresponding recording.
Claim: 38. Audio reproduction apparatus, comprising: a storage reader to read from storage a recording and a meta-data set associated with said recording, in which the meta-data set comprises a transformation profile; a transformer to apply transformations to the recording data in accordance with said transformation profile; and an output to output the transformed recording.
Current U.S. Class: 704/207
Current International Class: 10
Accession Number: edspap.20100235166
Database: USPTO Patent Applications
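Illustrative note: the workflow described in the abstract and in claims 1, 15 and 19 (compose a profile, store it as a metadata set with location data, then read it back, optionally select it by emotion tag, and apply it) might be sketched as below. This is a minimal sketch under stated assumptions, not the applicants' implementation. The names `ProfileElement`, `MetadataSet`, `apply_profile` and `select_by_emotion` are hypothetical, location data is assumed to be a pair of sample indices, only amplitude elements are handled, and a ramp is assumed to run linearly between unity gain and the element's `gain` parameter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProfileElement:
    """One predefined profile element (hypothetical layout, amplitude only)."""
    kind: str    # "uniform", "ramp_up" or "ramp_down"
    start: int   # location data: first sample index the element applies to
    end: int     # location data: one past the last sample index
    gain: float  # user-adjustable parameter: amplitude multiplier


@dataclass
class MetadataSet:
    """Transformation profile data plus an optional emotion tag."""
    elements: List[ProfileElement] = field(default_factory=list)
    emotion: Optional[str] = None


def element_gain(el: ProfileElement, i: int) -> float:
    """Gain of element `el` at sample index `i` (assumed linear ramps)."""
    span = max(el.end - el.start - 1, 1)
    t = (i - el.start) / span  # 0.0 at the first sample, 1.0 at the last
    if el.kind == "uniform":
        return el.gain
    if el.kind == "ramp_up":    # assumed: from unity up to el.gain
        return 1.0 + (el.gain - 1.0) * t
    if el.kind == "ramp_down":  # assumed: from el.gain back down to unity
        return el.gain + (1.0 - el.gain) * t
    raise ValueError(f"unsupported profile element: {el.kind}")


def apply_profile(samples: List[float], metadata: MetadataSet) -> List[float]:
    """Return a transformed copy; the stored recording is left untouched."""
    out = list(samples)
    for el in metadata.elements:
        # Clamp the location data to the bounds of the recording.
        for i in range(max(el.start, 0), min(el.end, len(out))):
            out[i] *= element_gain(el, i)
    return out


def select_by_emotion(sets: List[MetadataSet], emotion: str) -> Optional[MetadataSet]:
    """Pick the first metadata set whose emotion tag matches, if any."""
    for candidate in sets:
        if candidate.emotion == emotion:
            return candidate
    return None


if __name__ == "__main__":
    recording = [1.0] * 5
    excited = MetadataSet([ProfileElement("ramp_up", 0, 5, 2.0)], emotion="excited")
    chosen = select_by_emotion([excited], "excited")
    print(apply_profile(recording, chosen))
```

Because the profile is kept separate from the recording and applied only at playback, a single stored recording can be paired with several metadata sets, which is what allows the selection by emotion tag described in claim 19.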