
Talking Windows

Exploring New Speech Recognition And Synthesis APIs In Windows Vista

  • Robert Brown

This article is based on a prerelease version of WinFX. All information contained herein is subject to change.

This article discusses:

  • Elements of Speech
  • Talking to Windows Vista
  • Windows Vista Speech APIs
  • System.Speech.Synthesis
  • System.Speech.Recognition
  • Telephony Applications
  • Conclusion

This article uses the following technologies:
Windows Vista, WinFX

Microsoft has been researching and developing speech technologies for over a decade. In 1993, the company hired Xuedong (XD) Huang, Fil Alleva, and Mei-Yuh Hwang—three of the four people responsible for the Carnegie Mellon University Sphinx-II speech recognition system, which achieved fame in the speech world in 1992 due to its unprecedented accuracy. Right from the start, with the formation of the Speech API (SAPI) 1.0 team in 1994, Microsoft was driven to create a speech technology that was both accurate and accessible to developers through a powerful API. The team has continued to grow and over the years has released a series of increasingly powerful speech platforms.

In recent years, Microsoft has placed an increasing emphasis on bringing speech technologies into mainstream usage. This focus has led to products such as Speech Server, which is used to implement speech-enabled telephony systems, and Voice Command, which allows users to control Windows Mobile® devices using speech commands. So it should come as no surprise that the speech team at Microsoft has been far from idle in the development of Windows Vista™. The strategy of coupling powerful speech technology with a powerful API has continued right through to Windows Vista.

Windows Vista includes a built-in speech recognition user interface designed specifically for users who need to control Windows® and enter text without using a keyboard or mouse. There is also a state-of-the-art general purpose speech recognition engine. Not only is this an extremely accurate engine, but it's also available in a variety of languages. Windows Vista also includes the first of the new generation of speech synthesizers to come out of Microsoft, completely rewritten to take advantage of the latest techniques.

On the developer front, Windows Vista includes a new WinFX® namespace, System.Speech. This allows developers to easily speech-enable Windows Forms applications and apps based on the Windows Presentation Framework. In addition, there's an updated COM Speech API (SAPI 5.3) to give native code access to the enhanced speech capabilities of the platform. For more information on this, see the "New to SAPI 5.3" sidebar.

Elements of Speech

The concept of speech technology really encompasses two technologies: synthesizers and recognizers (see Figure 1). A speech synthesizer takes text as input and produces an audio stream as output. Speech synthesis is also referred to as text-to-speech (TTS). A speech recognizer, on the other hand, does the opposite. It takes an audio stream as input, and turns it into a text transcription.

Figure 1 Speech Recognition and Synthesis

A lot has to happen for a synthesizer to accurately convert a string of characters into an audio stream that sounds just as the words would be spoken. The easiest way to imagine how this works is to picture the front end and back end of a two-part system.

The front end specializes in the analysis of text using natural language rules. It analyzes a string of characters to figure out where the words are (which is easy to do in English, but not as easy in languages such as Chinese and Japanese). This front end also figures out details like functions and parts of speech—for instance, which words are proper nouns, numbers, and so forth; where sentences begin and end; whether a phrase is a question or a statement; and whether a statement is past, present, or future tense.

All of these elements are critical to the selection of appropriate pronunciations and intonations for words, phrases, and sentences. Consider that in English, a question usually ends with a rising pitch, or that the word "read" is pronounced very differently depending on its tense. Clearly, understanding how a word or phrase is being used is a critical aspect of interpreting text into sound. To further complicate matters, the rules are slightly different for each language. So, as you can imagine, the front end must do some very sophisticated analysis.

The back end has quite a different task. It takes the analysis done by the front end and, through some non-trivial analysis of its own, generates the appropriate sounds for the input text. Older synthesizers (and today's synthesizers with the smallest footprints) generate the individual sounds algorithmically, resulting in a very robotic sound. Modern synthesizers, such as the one in Windows Vista, utilize a database of sound segments built from hours and hours of recorded speech. The effectiveness of the back end depends on how good it is at selecting the appropriate sound segments for any given input and smoothly splicing them together.

If this all sounds vastly complicated, well, it is. Having these text-to-speech capabilities built into the operating system is very advantageous, as it allows applications to just use this technology. There's no need to go create your own speech engines. As you'll see later in the article, you can invoke all of this processing with a single function call. Lucky you!

Speech recognition is even more complicated than speech synthesis. However, it too can be thought of as having a front end and a back end. The front end processes the audio stream, isolating segments of sound that are probably speech and converting them into a series of numeric values that characterize the vocal sounds in the signal. The back end is a specialized search engine that takes the output produced by the front end and searches across three databases: an acoustic model, a lexicon, and a language model. The acoustic model represents the acoustic sounds of a language, and can be trained to recognize the characteristics of a particular user's speech patterns and acoustic environments. The lexicon lists a large number of the words in the language, along with information on how to pronounce each word. The language model represents the ways in which the words of a language are combined.

Neither of these models is trivial. It's impossible to specify exactly what speech sounds like. And human speech rarely follows strict and formal grammar rules that can be easily defined. An indispensable factor in producing good models is the acquisition of very large volumes of representative data. An equally important factor is the sophistication of the techniques used to analyze that data to produce the actual models.

Of course, no word has ever been said exactly the same way twice, so the recognizer is never going to find an exact match. And for any given segment of sound, there are very many things the speaker could potentially be saying. The quality of a recognizer is determined by how good it is at refining its search, eliminating the poor matches, and selecting the more likely matches. A recognizer's accuracy relies on it having good language and acoustic models, and good algorithms both for processing sound and for searching across the models. The better the models and algorithms, the fewer the errors that are made, and the quicker the results are found. Needless to say, this is a difficult technology to get right.

While the built-in language model of a recognizer is intended to represent a comprehensive language domain (such as everyday spoken English), any given application will often have very specific language model requirements. A particular application will generally only require certain utterances that have particular semantic meaning to that application. Hence, rather than using the general purpose language model, an application should use a grammar that constrains the recognizer to listen only for speech that the application cares about. This has a number of benefits: it increases the accuracy of recognition, it guarantees that all recognition results are meaningful to the application, and it enables the recognition engine to specify the semantic values inherent in the recognized text. Figure 2 shows one example of how these benefits can be put to use in a real-world scenario.

Figure 2 Using Speech Recognition for Application Input

Talking to Windows Vista

Accuracy is only part of the equation. With the Windows Vista speech recognition technology, Microsoft has a goal of providing an end-to-end speech experience that addresses key features that users need in a built-in desktop speech recognition experience. This includes an interactive tutorial that explains how to use speech recognition technology and helps the user train the system to understand the user's speech.

The system includes built-in commands for controlling Windows—allowing you to start, switch between, and close applications using commands such as "Start Notepad" and "Switch to Calculator." Users can control on-screen interface elements like menus and buttons by speaking commands like "File" and "Open." There's also support for emulating the mouse and keyboard by giving commands such as "Press shift control left arrow 3 times."

Windows Vista speech technology includes built-in dictation capabilities (for converting the user's voice into text) and edit controls (for inserting, correcting, and manipulating text in documents). You can correct misrecognized words by redictating, choosing alternatives, or spelling. For example, "Correct Robot, Robert." Or "Spell it R, O, B, E, R as in rabbit, T as in telephone." You can also speak commands to select text, navigate inside a document, and make edits—for instance, "Select 'My name is,'" "Go after Robert," or "Capitalize Brown."

The user interface is designed to be unobtrusive, yet to keep the user in control of the speech system at all times (see Figure 3). You have easy access to the microphone state, which includes a sleeping mode. Text feedback tells the user what the system is doing and provides instructions to the user. There's also a user interface for clarifying what the user has said: when the user utters a command that can be interpreted in multiple ways, the system uses this interface to clarify what was intended. Meanwhile, ongoing use allows the underlying models to adapt, continually improving accuracy over time.

Figure 3 Speech UI in Windows Vista

To enable built-in speech functionality, from the Start Menu choose All Programs | Accessories | Accessibility and click Speech Recognition. The first time you do this, the system will step you through the tutorial, where you'll be introduced to some basic commands. You also get the option of enabling background language model adaptation, by which the system will read through your documents and e-mail in the background to adapt the language model to better match the way you express yourself. There are a variety of things the default settings enable. I recommend that you ask the system "what can I say" and then browse the topics.

But you're a developer, so why do you care about all this user experience stuff? The reason this is relevant to developers is that this is default functionality provided by the operating system. This is functionality that your applications will automatically get. The speech technology uses the Windows accessibility interfaces to discover the capabilities of each application; it then provides a spoken UI for each. If a user says the name of an accessible element, then the system will invoke the default function of that element. Hence, if you have built an accessible application, you have by default built a speech-enabled application.

Windows Vista Speech APIs

Windows Vista can automatically speech-enable any accessible application. This is fantastic news if you want to let users control your application with simple voice commands. But you may want to provide a speech-enabled user interface that is more sophisticated or tailored than the generic speech-enabled UI that Windows Vista will automatically supply.

There are numerous examples of why you might need to do this. Suppose, for example, your user has a job that requires her to use her hands at all times. Any time she needs to hold a mouse or tap a key on the keyboard is time that her hands are removed from the job; this may compromise safety or reduce productivity. The same could be true for users who need their eyes to be looking at something other than a computer screen. Or, say your application has a very large number of functions that get lost in toolbar menus. Speech commands can flatten out deep menu structures, offering fast access to hundreds of commands. If your users ever say "that's easier said than done," they may be right.

In Windows Vista, there are two speech APIs:

  • SAPI 5.3 for native applications
  • The System.Speech.Recognition and System.Speech.Synthesis namespaces in WinFX

Figure 4 illustrates how each of these APIs relates to applications and the underlying recognition and synthesis engines.

Figure 4 Speech APIs in Windows Vista

The speech recognition engine is accessed via SAPI. Even the classes in the System.Speech.Recognition namespaces wrap the functionality exposed by SAPI. (This is an implementation detail of Windows Vista that may change in future releases, but it's worth bearing in mind.) The speech synthesis engine, on the other hand, is accessed directly by the classes in System.Speech.Synthesis or, alternatively, by SAPI when used in an unmanaged application.

Both types implement the SAPI device driver interface (DDI), which is an API that makes engines interchangeable to the layers above them, much like the way device driver APIs make hardware devices interchangeable to the software that uses them. This means that developers who use SAPI or System.Speech are still free to use other engines that implement the SAPI DDI (and many do).

Notice in Figure 4 that the synthesis engine is always instantiated in the same process as the application, but the recognition engine can be instantiated in another process called SAPISVR.EXE. This provides a shared recognition engine that can be used simultaneously by multiple applications. This design has a number of benefits. First, recognizers generally require considerably more run-time resources than synthesizers, and sharing a recognizer is an effective way to reduce the overhead. Second, the shared recognizer is also used by the built-in speech functionality of Windows Vista. Therefore, apps that use the shared recognizer can benefit from the system's microphone and feedback UI. There's no additional code to write, and no new UI for the user to learn.

New to SAPI 5.3

SAPI 5.3 is an incremental update to SAPI 5.1. The core mission and architecture for SAPI are unchanged. SAPI 5.3 adds performance improvements, overall enhancements to security and stability, and a variety of new functionality, including:

W3C Speech Synthesis Markup Language SAPI 5.3 supports the W3C Speech Synthesis Markup Language (SSML) version 1.0. SSML provides the ability to mark up voice characteristics, speed, volume, pitch, emphasis, and pronunciation so that a developer can make TTS sound more natural in their application.
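To make this concrete, here is a minimal SSML 1.0 document, a sketch using element names from the W3C recommendation, that adjusts prosody and emphasis around ordinary text:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  This sentence is spoken with the default settings.
  <prosody rate="slow" volume="loud">
    This phrase is rendered more slowly and more loudly,
  </prosody>
  and <emphasis>this</emphasis> word is emphasized.
</speak>
```

A document like this can be handed to any SSML-capable synthesizer, making the markup portable across engines.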

W3C Speech Recognition Grammar Specification SAPI 5.3 adds support for the definition of context-free grammars using the W3C Speech Recognition Grammar Specification (SRGS), with these two important constraints: it does not support the use of SRGS to specify dual-tone modulated-frequency (touch-tone) grammars, and it only supports the expression of SRGS as XML—not as Augmented Backus-Naur Form (ABNF).

Semantic Interpretation SAPI 5.3 enables an SRGS grammar to be annotated with JScript® for semantic interpretation, so that a recognition result may contain not only the recognized text, but also the semantic interpretation of that text. This makes it easier for apps to consume recognition results, and empowers grammar authors to provide a full spectrum of semantic processing beyond what could be achieved with name-value pairs.
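As an illustration, here is a sketch of a small SRGS XML grammar with JScript-style semantic tags. The exact tag-format identifier SAPI 5.3 expects isn't specified in this article, so treat the details of the <tag> contents as an assumption based on the W3C semantic interpretation drafts:

```xml
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" root="size">
  <rule id="size" scope="public">
    <!-- The user says one of these words; the tag attaches a
         semantic value to the recognition result. -->
    <one-of>
      <item>small <tag>$.Size = "small"</tag></item>
      <item>large <tag>$.Size = "large"</tag></item>
    </one-of>
  </rule>
</grammar>
```

A result produced against this grammar carries a Size property, so the application reads a clean value rather than re-parsing the recognized text.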

User-Specified "Shortcuts" in Lexicons This is the ability to add a string to the lexicon and associate it with a shortcut word. When dictating, the user can say the shortcut word and the recognizer will return the expanded string.

As an example, a developer could create a shortcut for a location so that a user could say "my address" and the actual data would be passed to the application as "123 Smith Street, Apt. 7C, Bloggsville 98765, USA".

Once the shortcut has been added to the speech lexicon, every time a user says "my address," the actual address is returned as the transcribed text.

Discovery of Engine Pronunciations SAPI 5.3 enables applications to query the Windows Vista recognition and synthesis engines for the pronunciations they use for particular words. This API will tell the application not only the pronunciation, but how that pronunciation was derived.

System.Speech.Synthesis

Let's take a look at some examples of how to use speech synthesis from a managed application. In the grand tradition of UI output examples, I'll start with an application that simply says "Hello, world," shown in Figure 5. This example is a bare-bones console application as freshly created in Visual C#®, with three lines added. The first added line simply introduces the System.Speech.Synthesis namespace. The second declares and instantiates an instance of SpeechSynthesizer, which represents exactly what its name suggests: a speech synthesizer. The third added line is a call to SpeakText. This is all that's needed to invoke the synthesizer!

Figure 5 Saying Hello
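Since the Figure 5 listing isn't reproduced in this text, here is a minimal sketch of what it describes. Note that SpeakText is the prerelease method name used in this article; it may appear simply as Speak in later builds:

```csharp
using System;
using System.Speech.Synthesis;   // first added line: the new namespace

class Program
{
    static void Main()
    {
        // Second added line: create a synthesizer using the default voice.
        SpeechSynthesizer synth = new SpeechSynthesizer();

        // Third added line: one call renders the text as audio.
        synth.SpeakText("Hello, world");
    }
}
```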

By default, the SpeechSynthesizer class uses the synthesizer that is nominated as default in the Speech control panel. But it can use any SAPI DDI-compliant synthesizer.

The next example (see Figure 6) shows how this can be done, using the old Sam voice from Windows 2000 and Windows XP, and the new Anna and Lili voices from Windows Vista. (Note that this and all remaining System.Speech.Synthesis examples use the same code framework as the first example, and just replace the body of Main.) This example shows three instances of the SelectVoice method using the name of the desired synthesizer. It also demonstrates the use of the Windows Vista Chinese synthesizer, Lili. Incidentally, Lili also speaks English very nicely.

Figure 6 Hearing Voices
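A sketch of the body of Main for this example follows. The exact voice-name strings are assumptions; they depend on which voices are installed on the machine:

```csharp
SpeechSynthesizer synth = new SpeechSynthesizer();

// Select each voice by name, then speak with it.
synth.SelectVoice("Microsoft Sam");
synth.SpeakText("I'm Sam, from Windows 2000 and Windows XP.");

synth.SelectVoice("Microsoft Anna");
synth.SpeakText("I'm Anna, new in Windows Vista.");

synth.SelectVoice("Microsoft Lili");
synth.SpeakText("I'm Lili, the Chinese synthesizer. I speak English too.");
```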

In both of these examples, I use the synthesis API much as I would a console API: an application simply sends characters, which are rendered immediately in series. But for more sophisticated output, it's easier to think of synthesis as the equivalent of document rendering, where the input to the synthesizer is a document that contains not only the content to be rendered, but also the various effects and settings that are to be applied at specific points in the content.

Much like an XHTML document can describe the rendering style and structure to be applied to specific pieces of content on a Web page, the SpeechSynthesizer class can consume an XML document format called Speech Synthesis Markup Language (SSML). The W3C SSML recommendation (www.w3.org/TR/speech-synthesis) is very readable, so I'm not going to dive into describing SSML in this article. Suffice it to say, an application can simply load an SSML document directly into the synthesizer and have it rendered. Here's an example that loads and renders an SSML file:
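The listing amounts to something like the following sketch. SpeakSsml is the method name in later builds of the API; the prerelease naming may differ:

```csharp
using System.IO;
using System.Speech.Synthesis;

class Program
{
    static void Main()
    {
        SpeechSynthesizer synth = new SpeechSynthesizer();

        // Read the SSML document and hand the whole thing to the
        // synthesizer to render.
        string ssml = File.ReadAllText("prompt.ssml");
        synth.SpeakSsml(ssml);
    }
}
```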

A convenient alternative to authoring an SSML file is to use the PromptBuilder class in System.Speech.Synthesis. PromptBuilder can express almost everything an SSML document can express, and is much easier to use (see Figure 7). The general model for creating sophisticated synthesis is to first use a PromptBuilder to build the prompt exactly the way you want it, and then use the Synthesizer's Speak or SpeakAsync method to render it.

Figure 7 Using PromptBuilder

Figure 7 illustrates a number of powerful capabilities of the PromptBuilder. The first thing to point out is that it generates a document with a hierarchical structure. The example uses one speaking style nested within another. At the beginning of the document, I start the speaking style I want used for the entire document. Then about halfway through, I switch to a different style to provide emphasis. When I end this style, the document automatically reverts to the previous style.

The example also shows a number of other handy capabilities. The AppendAudio function causes a WAV file to be spliced into the output, with a textual equivalent to be used if the WAV file can't be found. The AppendTextWithPronunciation function allows you to specify the precise pronunciation of a word. A speech synthesis engine already knows how to pronounce most of the words in general use in a language, through a combination of an extensive lexicon and algorithms for deriving the pronunciation of unknown words. But this won't work for all words, such as some specialized terms or brand names. For example, "WinFX" would probably be pronounced as "winfeks". Instead, I use the International Phonetic Alphabet to describe "WinFX" as "wɪnɛfɛks", where the letter "ɪ" is Unicode character 0x026A (the "i" sound in the word "fish", as opposed to the "i" sound in the word "five") and the letter "ɛ" is Unicode character 0x025B (the General American "e" sound in the word "bed").

In general, a synthesis engine can distinguish between acronyms and capitalized words. But occasionally you'll find an acronym that the engine's heuristics incorrectly deduce to be a word. So you can use the AppendTextWithHint function to identify a token as an acronym. There are a variety of nuances you can introduce with the PromptBuilder. My example is more illustrative than exhaustive.
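Taken together, the PromptBuilder techniques just described (nested styles, spliced audio, and explicit pronunciations) might be sketched like this. The PromptStyle values and method overloads shown follow the shipped System.Speech API and should be read as assumptions against this prerelease:

```csharp
using System;
using System.Speech.Synthesis;

class Program
{
    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();

        // Start the style that applies to the whole document.
        builder.StartStyle(new PromptStyle(PromptRate.Medium));
        builder.AppendText("Welcome to the demonstration.");

        // Nest a more emphatic style; ending it reverts to the outer one.
        builder.StartStyle(new PromptStyle(PromptEmphasis.Strong));
        builder.AppendText("This part really matters.");
        builder.EndStyle();

        // Splice in a WAV file, with text to speak if it can't be found.
        builder.AppendAudio(new Uri("http://example.com/chime.wav"), "ding!");

        // Give the precise pronunciation of a word the engine won't know.
        builder.AppendTextWithPronunciation("WinFX", "wɪnɛfɛks");
        builder.EndStyle();

        new SpeechSynthesizer().Speak(builder);
    }
}
```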

Another benefit of separating content specification from run-time rendering is that you are then free to decouple the application from the specific content it renders. You can use PromptBuilder to persist its prompt as SSML to be loaded by another part of the application, or a different application entirely. The following code writes to an SSML file with PromptBuilder:
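A sketch of that serialization follows; ToXml is the serialization method name in later builds, so treat it as an assumption here:

```csharp
using System.IO;
using System.Speech.Synthesis;

class Program
{
    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();
        builder.AppendText("This prompt was built in one application ");
        builder.AppendText("and can be rendered by another.");

        // Persist the builder's content as SSML for later use.
        File.WriteAllText("prompt.ssml", builder.ToXml());
    }
}
```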

Another way to decouple content production is to render the entire prompt to an audio file for later playback:
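A minimal sketch of rendering to a file, using the output-redirection methods of the shipped SpeechSynthesizer class:

```csharp
using System.Speech.Synthesis;

class Program
{
    static void Main()
    {
        SpeechSynthesizer synth = new SpeechSynthesizer();

        // Redirect output to a WAV file instead of the audio device.
        synth.SetOutputToWaveFile("greeting.wav");
        synth.SpeakText("Thank you for calling.");

        // Restore the default audio device for subsequent speech.
        synth.SetOutputToDefaultAudioDevice();
    }
}
```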

Whether to use SSML markup or the PromptBuilder class is probably a matter of stylistic preference. You should use whichever you feel more comfortable with.

One final note about SSML and PromptBuilder is that the capabilities of every synthesizer will be slightly different. Therefore, the specific behaviors you request with either of these mechanisms should be thought of as advisory requests that the engine will apply if it is capable of doing so.

System.Speech.Recognition

While you could use the general dictation language model in an application, you would very rapidly encounter a number of application development hurdles regarding what to do with the recognition results. For example, imagine a pizza ordering system. A user could say "I'd like a pepperoni pizza" and the result would contain this string. But it could also contain "I'd like pepper on a plaza" or a variety of similar sounding statements, depending on the nuances of the user's pronunciation or the background noise conditions. Similarly, the user could say "Mary had a little lamb" and the result would contain this, even though it's meaningless to a pizza ordering system. All of these erroneous results are useless to the application. Hence an application should always provide a grammar that describes specifically what the application is listening for.

In Figure 8, I've started with a bare-bones Windows Forms application and added a handful of lines to achieve basic speech recognition. First, I introduce the System.Speech.Recognition namespace, and then instantiate a SpeechRecognizer object. Then I do three things in Form1_Load: build a grammar, attach an event handler to that grammar so that I can receive the SpeechRecognized events for that grammar, and then load the grammar into the recognizer. At this point, the recognizer will start listening for speech that fits the patterns defined by the grammar. When it recognizes something that fits the grammar, the grammar's SpeechRecognized event handler is invoked. The event handler itself accesses the Result object and works with the recognized text.

Figure 8 Ordering a Pizza
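Since the Figure 8 listing isn't reproduced here, the following sketch shows the shape of those added lines. The type names follow the shipped System.Speech.Recognition API, so the prerelease builds this article describes may differ slightly:

```csharp
using System;
using System.Speech.Recognition;
using System.Windows.Forms;

public class Form1 : Form
{
    // The shared Windows Vista recognizer.
    SpeechRecognizer recognizer = new SpeechRecognizer();

    public Form1()
    {
        Load += Form1_Load;
    }

    void Form1_Load(object sender, EventArgs e)
    {
        // Build a grammar that listens only for what we care about.
        GrammarBuilder phrase = new GrammarBuilder("I'd like a");
        phrase.Append(new Choices("pepperoni", "cheese", "mushroom"));
        phrase.Append("pizza");

        // Attach a handler, then load the grammar into the recognizer.
        Grammar grammar = new Grammar(phrase);
        grammar.SpeechRecognized += Grammar_SpeechRecognized;
        recognizer.LoadGrammar(grammar);
    }

    void Grammar_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        MessageBox.Show("You said: " + e.Result.Text);
    }
}
```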

The System.Speech.Recognition API supports the W3C Speech Recognition Grammar Specification (SRGS), documented at www.w3.org/TR/speech-grammar. The API even provides a set of classes for creating and working with SRGS XML documents. But for most cases, SRGS is overkill, so the API also provides the GrammarBuilder class that suffices nicely for our pizza ordering system.

The GrammarBuilder lets you assemble a grammar from a set of phrases and choices. In Figure 8 I've eliminated the problem of listening for utterances I don't care about ("Mary had a little lamb"), and constrained the engine so that it can make much better choices between ambiguous sounds. It won't even consider the word "plaza" when the user mispronounces "pizza". So in a handful of lines, I've vastly increased the accuracy of the system. But there are still a couple of problems with the grammar.

The approach of exhaustively listing every possible thing a user can say is tedious, error prone, difficult to maintain, and only practically achievable for very small grammars. It is preferable to define a grammar that defines the ways in which words can be combined. Also, if the application cares about the size, toppings, and type of crust, then the developer has quite a task to parse these values out of the result string. It's much more convenient if the recognition system can identify these semantic properties in the results. This is very easy to do with System.Speech.Recognition and the Windows Vista recognition engine.

Figure 9 shows how to use the Choices class to assemble grammars where the user says something from a list of alternatives. In this code, the contents of each Choices instance are specified in the constructor as a sequence of string parameters. But you have a lot of other options for populating Choices: you can iteratively add new phrases, construct Choices from an array, add Choices to Choices to build the complex combinatorial rules that humans understand, or add GrammarBuilder instances to Choices to build increasingly flexible grammars (as demonstrated by the Permutations part of the example).

Figure 9 Using Choices to Assemble Grammars

Figure 9 also shows how to tag results with semantic values. When using GrammarBuilder, you can append Choices to the grammar and attach a semantic value to each choice, as the example in Figure 9 does.

Sometimes a particular utterance will have an implied semantic value that was never uttered. For example, if the user doesn't specify a pizza size, the grammar can supply the size "regular" by default, as the example also demonstrates.

Fetching the semantic values from the results is done by accessing RecognitionEventArgs.Result.Semantics[<name>].
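Here is a sketch of attaching and reading semantic values. SemanticResultKey and SemanticResultValue are the shipped-API names, and the event-argument type in the prerelease builds is referred to above as RecognitionEventArgs, so treat these names as assumptions; the implied-default case ("regular") is omitted for brevity:

```csharp
using System;
using System.Speech.Recognition;

class PizzaGrammar
{
    public static Grammar Create()
    {
        // Each alternative carries a semantic value alongside its phrase.
        Choices sizes = new Choices();
        sizes.Add(new SemanticResultValue("small", "small"));
        sizes.Add(new SemanticResultValue("large", "large"));

        GrammarBuilder order = new GrammarBuilder("I'd like a");
        order.Append(new SemanticResultKey("Size", sizes));
        order.Append("pizza");

        return new Grammar(order);
    }

    public static void OnRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Fetch the tagged value rather than parsing the raw text.
        Console.WriteLine("Size: " + e.Result.Semantics["Size"].Value);
    }
}
```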

Resources

This article only touches on the capabilities of the new speech technology included in Windows Vista. Here are several resources to help you learn more:

  • Windows SDK
  • Microsoft Speech Server
  • Windows Accessibility

The following members of the Microsoft speech team have active blogs that you may find useful:

  • Jay Waltmunson
  • Jen Anderson
  • Philipp Schmid
  • Richard Sprague
  • Rob Chambers

Telephony Applications

One of the biggest growth areas for speech applications is in speech-enabled telephony systems. Many of the principles are the same as with desktop speech: recognition and synthesis are still the keystone technologies, and good grammar and prompt design are critical.

There are a number of other factors necessary for the development of these applications. A telephony application needs a completely different acoustic model. It needs to be able to interface with telephony systems, and because there's no GUI, it needs to manage a spoken dialog with the user. A telephony application also needs to be scalable so that it can service as many simultaneous calls as possible without compromising performance.

Designing, tuning, deploying, and hosting a speech-enabled telephony application is a non-trivial project that the Microsoft Speech Server platform and SDK have been developed to address.

Conclusion

Windows Vista contains a new, more powerful desktop speech platform that is built into the OS. The intuitive UI and powerful APIs make it easy for end users and developers to tap into this technology. If you have the latest Beta build of Windows Vista, you can start playing with these new features immediately.

The Windows Vista Speech Recognition Web site should be live by the time you're reading this. For links to other sources of information about Microsoft speech technologies, see the "Resources" sidebar.

Robert Brown is a Lead Program Manager on the Microsoft Speech & Natural Language team. Robert joined Microsoft in 1995 and has worked on VOIP and messaging technologies, Speech Server, and the speech platform in Windows Vista. Special thanks to Robert Stumberger, Rob Chambers and other members of the Microsoft speech team for their contribution to this article.

Use Speech Recognition in Vista [How To]

If you have a microphone and a desire to speak to your computer and tell it what to do, this guide is for you. In this guide, you’ll learn how to calibrate Windows Speech Recognition and where to learn how to use it.

Configuring Speech Recognition

Speech recognition configuration is easy; all you need is a microphone and a few minutes to run some tests.

Press the Start button, type speech, and click Speech Recognition. If this is the first time you've run speech recognition, you'll see the wizard pop up straight away. If you've run it before and didn't complete the setup, you may have to click Speech Recognition Options on the Start menu instead.

Windows Speech Recognition 1

After clicking next, you’ll need to tell Windows what type of microphone you have.

Windows Speech Recognition 2

Now you’ll need to set up your microphone so Windows can hear you accurately.

Windows Speech Recognition 3

Speak into the microphone and try to keep it at a distance where you are speaking at a normal level and the noise levels fall within the green area.

Windows Speech Recognition 4

Once you’ve successfully set up your microphone, you are ready to move on.

Windows Speech Recognition 5

You can improve accuracy by reading a document aloud while Windows figures out both how you pronounce certain words and which words you merge together when you speak.

Windows Speech Recognition 6

You now have an opportunity to print the speech reference card. This card has commands you can speak; this is useful when you are first learning the potential of Windows Speech Recognition.

Windows Speech Recognition 7

If you plan to use speech recognition frequently, you can choose to run it at Windows startup. You can always change this option later via the Speech Recognition options screen.

Windows Speech Recognition 8

At this step, I advise taking the tutorial; this is where you’ll learn how to become a pro with the software.

Windows Speech Recognition 9

If you need to change any settings, press the Start button, type speech, and click Speech Recognition Options. Here you can configure settings and even run the wizard again. If you need to set up speech recognition for more than one user, each will need their own user account.

Windows Speech Recognition 10

Rich is the owner and creator of Windows Guides; he spends his time breaking things on his PC so he can write how-to guides to fix them.


30 thoughts on “Use Speech Recognition in Vista [How To]”

I have given the speech recognition a go on Vista. I must admit, at first, it is awesome: opening and closing windows, scrolling through your start menu, all manner of everyday mouse and keyboard tasks, even dictating a letter in MS Word. As fascinating and fun as this is at first, whether you can control your PC by voice to any productive level is questionable, and therefore this feature should remain just that… a fun interaction, a party trick for when your kids are on the computer again and you walk over and halt their internet experience with two words: “Close Firefox”. My only gripe with voice recognition, apart from its apparent productivity limitations, is that it gets easily confused with Scottish accents! lol.

I must have a very different voice because these voice recognition things ALWAYS struggle to understand what I’m trying to say…then again so do people…hmm…

Hmm speech recognition doesn’t always understand me, I guess I’ll try it again.

I was wondering if there is a way to activate Speech Recognition in non-English (in my case, Russian) Windows Vista. Please post here if you know a workaround.

You’ll need Windows Vista Ultimate or Enterprise:

In Windows Vista, Windows Speech Recognition works in the current language of the OS. That means that in order to use another language for speech recognition, you have to have the appropriate language pack installed. Language packs are available as free downloads through Windows Update for the Ultimate and Enterprise versions of Vista. Once you have the language installed, you’ll need to change the display language of the OS to the language you want to use. Both of these are options on the “Regional and Language Options” control panel. You can look in help for “Install a display language” or “Change the display language”.

That’s the problem. I have Home Premium.

Hey this is really hitech, like this so much. Awesome post.

Ah, speech recognition… we spent a lot of time configuring it, and then a Windows re-install (due to an unfortunate event) lost us all of that hard work.

Luckily for us, Microsoft released a backup system for the speech recognition avoiding us a full recalibration. Here is the link: http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=1d60a5a6-85d4-4db2-a581-a41f66561a7d

Thanks for the heads up and link!

Scott Hanselman

Speech Recognition in Windows Vista - I'm Listening

windows vista speech recognition

Of course I was excited to hear that Windows Vista would include lots of new speech recognition features, and today I finally got to try them out.  I plugged in my Logitech USB headset and ran through the tutorial.

You really have to try it to fully understand the improvements that have been made to accessibility in Windows Vista.  While this entire blog post was dictated using the built-in speech features in Vista, the dictation features, frankly, aren't that impressive.  To be clear, they work, and they work well.  But it's the interface, the user experience, that's so amazing.

windows vista speech recognition

But these are speech-specific things. What was really interesting to me is how easy it is to interact with the entire system, the shell, without touching your mouse.  This is going to be Huge for people who CAN'T touch the mouse.

One of the most clever user interface experiences is the "show numbers" interface. When you're using Windows Vista speech recognition and you tell it to "show numbers," the current window's user interface elements are overlaid with numbered regions, so that they can be easily selected just by saying a number.

For example, notice the interface of Windows Live Writer as seen below.  The default behavior lets me click user interface elements by name - if I simply say "insert picture," the system will click the Insert Picture user interface element just because it's on the screen - but if there's a user interface element, like a toolbar button, that is difficult to express verbally, I can click it easily using show numbers.

windows vista speech recognition

The same feature is used when selecting words that appear multiple times within a chunk of text.  For example, if a paragraph contained the name 'Hanselman' four times and I said "Select Hanselman," each instance of the word would be overlaid with a number, allowing me to quickly indicate the one I meant.

I'm not familiar with the Windows Speech API, but it'll be interesting to see how vendors like the folks at Dragon NaturallySpeaking are meant to integrate their speech recognition engines with the existing interface experience provided by Vista out of the box.

As someone who fortunately does have the use of both hands, I find speech to be most valuable when I can have one hand on the keyboard, one hand on the mouse, and be speaking simultaneously.  It's certainly true that I can talk faster than I can type, and it's very, very difficult to beat really good speech recognition software by just typing.

It's worth noting that they've removed all of the speech recognition features from Office 2007, and there are a number of people who were considerably torqued about that decision.  That said, if you're into speech recognition or you use speech recognition software in your everyday life, the improvements in speech in Vista are reason enough to upgrade your OS.

And sure, it's not perfect, but I'm using a crappy microphone in a noisy room on a slowish machine while speaking quietly so as not to wake the baby.  Not too shabby.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.


Lamont Wood

How to use Windows’ built-in speech recognition

Talking to your Windows PC instead of typing can substantially boost your productivity — if you know the right way to do it.


It’s already built into your Windows PC. All you have to do is acquire one small piece of hardware, load the software, and learn to use it. After that, inputting text can become almost effortless, potentially enhancing your productivity substantially.

I’m talking about Windows Speech Recognition (WSR), one of the least-heralded features of Microsoft Windows. As it turns out, Microsoft has been offering speech recognition for Windows since it was included with Office XP in 2001. It became part of the operating system with Windows Vista in 2007.

User skill is a factor in the success of speech-to-text input, and you will have better results if you invest some time in learning how to talk to your PC. Learning manual typing involved a big investment in training that you have probably forgotten by now. Enhancing that skill with speech recognition will, by comparison, take much less effort — about two weeks of practice and reorientation.

Deciding what you want to say and then saying it, instead of typing it, will seem unnatural at first. But once you master it, you will find the process adds almost no effort to the overall task of composition. Moreover, composition without the effort of typing eliminates a significant barrier between you and the final product, a barrier that you may have come to take for granted.

Two quick side notes before we begin:

  • WSR is not directly connected to Cortana, the voice-activated virtual assistant built into Windows 10. WSR takes dictation and carries out simple commands; it doesn’t include Cortana’s advanced AI features. But WSR can be used to address Cortana.
  • There’s a new feature in the Windows 10 Fall Creators Update called Dictation that replicates WSR’s text-entry and text-editing functions but not its UI command capabilities. Since Dictation is limited in scope and availability, I’ll focus on the more complete Windows Speech Recognition tool in this story.

First, get a decent mic

While WSR is part of your operating system, the necessary microphone is not built into your PC. Laptop and desktop mikes will not produce worthwhile results; you need a head-mounted unit with a microphone that can be positioned a consistent distance from your mouth. That, with constant attention to enunciation, should produce input speed and accuracy superior to your typing.

Careful, consistent enunciation is key. Speak archly, as if you were a miffed radio announcer. Envision the text you want to produce as you speak the words. That will help your enunciation and make you less likely to skip one-syllable words or stumble into disfluencies like “ah, y’know.” As always, taking your time up front is faster than having to catch errors later.

In this story, I’ll show you how to use the WSR software to best effect. But first, a word about your environment: You’ll want to get away from droning ventilator noises. Office chatter, on the other hand, is rarely a problem. You don’t have to yell, so if those around you are comfortable with you talking on the phone, you will fit in using WSR.

Setting up the software

The steps that follow are basically the same whether you use Windows 7, Windows 8.x or Windows 10. I’ll note where they differ and give instructions for Windows 7 and Windows 10, the most widely used platforms.

To get started with WSR, first make sure your headset mic is plugged in. Then:

  • In Windows 10, type “speech” into the search box next to the Start button, and among the results select the Speech Recognition option (not, initially, the Speech Recognition desktop app).
  • In Windows 7, click the Start button, then Control Panel, then Ease of Access, and then Speech Recognition.

You’ll see this screen:

Windows Speech Recognition - configure pane

To launch the WSR software, click Start Speech Recognition. A setup wizard will run the first time WSR is invoked, making sure your microphone works and asking configuration questions such as whether you want WSR to learn your commonly used words and phrases by reviewing your documents and emails, whether you want to turn on the microphone with your voice (not just turn it off), and whether you want WSR to launch when you start up your computer. As part of the setup wizard, you can take a tutorial showing how to dictate text and issue commands to your computer with speech.

Later, you can change any of the configurations you chose in the wizard by returning to the Speech Recognition pane in Control Panel and clicking the “Advanced speech options” link on the left.

The fourth option on the Speech Recognition pane, training your computer to understand you, used to be essential. Nowadays it’s optional for many users — speech recognition technology has improved so much in the past several years that if you have a generic accent and a quiet background, you may find that you can be productive immediately. That said, if you start speaking to WSR and find that it doesn’t understand you, come back and do the “Train your computer” step — it’ll help!

You’ll want to print out and study the 11-page reference document (option five) which has sections for both Windows 10 and Windows 7.

With Windows 10, make sure you start up WSR before launching your word processor.

Inputting text

When you finish the setup wizard, the WSR control bar will appear at the top of the screen. (It can be moved anywhere, incidentally.) For instance, if you have a small Notepad window open at the top of your screen, you should see something like this:

Windows Speech Recognition - control bar in Notepad

The microphone icon on the left is lit bright blue when WSR is listening (as shown above). The vertical bar to the right of the icon shows the volume of the input. The text window in the middle indicates what the software is doing. On the right are the buttons for minimizing and exiting WSR.

In the future when you run WSR, you can bypass the Control Panel and launch the app directly:

  • In Windows 10, type “speech” into the search box to the right of the Start button. The Windows Speech Recognition desktop app will appear at the top of the results list — select it to launch it.
  • In Windows 7, click the Start button and choose All Programs > Accessories > Ease of Access > Windows Speech Recognition.

When you do so, the WSR control bar will appear, but depending on how you have it configured, the microphone icon might be gray or dark blue, and the text window might say “Off” or “Sleeping.” If it’s off, click the icon to activate listening mode. If it’s sleeping, say “Start listening” or click the icon. (Click the icon again to turn WSR off or put it to sleep.)

With WSR listening, say the following: “the quick brown fox jumped over the lazy dog comma one hundred twenty three thousand four hundred fifty six times at seven eight nine main street period”

Assuming you’re using 28-point Arial as the font, you’ll see this:

Windows Speech Recognition - input text

Notice that you had to speak the punctuation aloud. Meanwhile, WSR automatically capitalizes the first word of each sentence and formats the numbers.

But you’ve noticed a mistake: “main street” should be capitalized. So you say the following: “go before main street.” That puts the input cursor just in front of “main street.” Then you say: “capitalize next two words.” Main Street will be capitalized, and the correction is done.

There’s a full range of text navigation commands in WSR, listed on this handy Windows support page . Learning them will greatly facilitate use. Additionally, you can verbally invoke any key on the keyboard, including key combinations — for example, say “press control A” to select all the text in a document.

If WSR incorrectly recognizes a word you say, you can correct it by telling it to select the word and saying, “correct that.” WSR will present you with a list of similar words. If one of those possibilities is correct, you can say the number next to it (“2,” in the example below) and “OK” to insert it. Or you can repeat the word you originally said, or spell it. Using the correction facility should reduce the chances of that error being repeated.

Windows Speech Recognition - correction

Program controls

Now it’s time to save our deathless text. Typically, you can select any menu item shown on the screen by simply saying “click [menu item].” That’s trivial with a simple program like Notepad, but when using more complicated software like Microsoft Word, you are unlikely to remember the names of all available commands. For such situations WSR has the “show numbers” command. Say those two words and you’ll get a result like this:

Windows Speech Recognition - show numbers

Suddenly, everything on the screen susceptible to being clicked is shaded in light blue and overlaid with a number. (The shading will fade in and out so you can read what’s under it.) If you say a number, that button will be covered with an “OK” icon, and if it’s really the one you want, you respond by saying “OK.” (Or you can say “click” and then the number. When needed, you can also say “double-click,” “right-click,” or “shift-click.”)

You can use “show numbers” for each submenu you reach, and verbally invoke arrow keys (say “press right arrow,” for example) to move around in text boxes.

In this case we’ll say “five” and then “OK” to invoke the File menu, and then “show numbers” again for the submenu:

Windows Speech Recognition show numbers2

Saving the file for the first time, you say “click four” for “Save As.” When asked for a name, you can say “test file” (for example) and a pane pops up asking you to choose from a list of possibilities. Say the number next to the one you want and “OK,” and the file name you chose will appear in the Save dialog box. Say “click save” to complete the save.

After saving, we can close Notepad by saying “show numbers” again and then saying “three” and “OK” to select the X button in the upper right. (Saying “close” also works.)

Speaking of applications, you can switch between open apps by saying “switch to [program name].” If you have no clue as to what’s open, you can say “switch application,” and WSR will present a list of open apps. Say the number next to the app you want and “OK.”

Finally, you can open a new app by saying “open [program name]” — for instance, “open Paint” to launch the venerable Windows graphics editor.

Other controls

Speaking of Paint, WSR has facilities for mouse control, which is necessary for most graphics programs. We’ll demonstrate them using Paint since it’s built into Windows, but they should work in any graphics program. To try them out, say “open Paint” and, using the “show numbers” command, pick out the Rectangle shape control.

Then say “mousegrid.” The screen is then covered with a numbered grid, as shown:

Windows Speech Recognition - mousegrid command

Say the number of the grid section you’re interested in (in this case, “one”), and that grid section is covered by a smaller three-by-three grid:

Windows Speech Recognition - mousegrid 2

You can use this method to zero in on the point you want, going six levels down if necessary. Notice that the grid covers the entire screen, not just the work area. This allows you to select buttons and menu items. Once you get a grid cell over the desired button, you can say “click [grid cell number]” to generate a mouse click at that point.

For drawing, you need to click and drag, which is a two-step process with WSR using “mark” and “click.” First, use the mousegrid command to zero in on the point where you want to begin. Once there, say “[grid cell number] mark.” The mousegrid process then automatically restarts, and you drill down to the second point. Once there, say “[grid cell number] click.”

After you’ve selected the Rectangle tool and then performed “mark and click” on the mousegrid, Paint draws a rectangle between the two points, all without you touching a mouse or keyboard:

Windows Speech Recognition - draw rectangle in Paint

Go forth and speak

You can use the same techniques with any business software to accomplish just about any task. Between the mousegrid command, the show-numbers procedure, and keyboard emulation there’s probably little you can’t do with your voice through WSR — in theory. Whether WSR is practical in a particular setting or for a particular project is something you’ll figure out as you go along.

For composing a first draft, I’ve found that using speech recognition about doubles my productivity. When it comes to editing or rewriting, when it’s time to ponder the “rhythm, texture, and flow” of the product, the difference between typing and talking is negligible. For email, where polish is usually secondary, speech recognition is a natural — but do proofread before hitting “Send.”


Lamont Wood is a freelance writer in San Antonio.


Speech Recognition And Synthesis Managed APIs In Windows Vista

  • Download Speechpad Demo - 44.7 KB
  • Download Speechpad Source - 139.8 KB

Screenshot - Speechpad.jpg

Introduction

One of the coolest features to be introduced with Windows Vista is the new built-in speech recognition facility. To be fair, it has been there in previous versions of Windows, but not in the useful form in which it is now available. Best of all, Microsoft provides a managed API with which developers can start digging into this rich technology. For a fuller explanation of the underlying technology, I highly recommend the Microsoft whitepaper. This tutorial will walk you through building a simple text pad application, which we will then trick out with a speech synthesizer and a speech recognizer using the .NET managed API wrapper for SAPI 5.3. By the end of this tutorial, you will have a working application that reads your text back to you, obeys your voice commands, and takes dictation. But first, a word of caution: this code will only work with Visual Studio 2005 installed on Windows Vista. It does not work on XP, even with .NET 3.0 installed.

Because Windows Vista has only recently been released, there are, as of this writing, several extant problems relating to developing on the platform. The biggest hurdle is that there are known compatibility problems between Visual Studio and Vista. Visual Studio .NET 2003 is not supported on Vista, and there are currently no plans to resolve any compatibility issues there. Visual Studio 2005 is supported, but in order to get it working well, you will need to make sure you also install Service Pack 1 for Visual Studio 2005. After this, you will also need to install a beta update for Vista called, somewhat confusingly, "Visual Studio 2005 Service Pack 1 Update for Windows Vista Beta". Even after doing all this, you will find that the cool new assemblies that come with Vista, such as the System.Speech assembly, still do not show up in your Add References dialog in Visual Studio. If you want them to show up, you will finally need to add a registry entry indicating where the Vista DLLs are to be found. Open the registry editor by running regedit.exe from the Vista search bar. Add the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\AssemblyFolders\v3.0 Assemblies with this value: C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0. (You can also install it under HKEY_CURRENT_USER, if you prefer.) Now, we are ready to start programming in Windows Vista.
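The registry change described above can also be expressed as a .reg file you can double-click to import (a sketch based only on the key and value given in the text; this applies to the Vista-era setup only):

```
Windows Registry Editor Version 5.00

; Makes the v3.0 reference assemblies (including System.Speech)
; appear in Visual Studio 2005's Add References dialog.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\AssemblyFolders\v3.0 Assemblies]
@="C:\\Program Files\\Reference Assemblies\\Microsoft\\Framework\\v3.0"
```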

Before working with the speech recognition and synthesis functionality, we need to prepare the ground with a decent text pad application to which we will add our cool new toys. Since this does not involve Vista, you do not really have to follow this step in order to learn the speech recognition API. If you already have a good base application, you can skip ahead to the next section, Speechpad, and use the code there to trick out your app. If you do not have a suitable application at hand, but also have no interest in walking through the construction of a text pad application, you can just unzip the source code linked above and pull out the included Textpad project. The source code contains two Visual Studio 2005 projects: the Textpad project, which is the base application for the SR functionality, and Speechpad, which includes the final code.

All the same, for those with the time to do so, I feel there is much to gain from building an application from the ground up. The best way to learn a new technology is to use it oneself and to get one's hands dirty, as it were, since knowledge is always more than simply knowing that something is possible; it also involves knowing how to put that knowledge to work. We know by doing, or as Giambattista Vico put it, verum et factum convertuntur ("the true and the made are interchangeable").

Textpad is an MDI application containing two forms: a container, called Main.cs, and a child form, called TextDocument.cs. TextDocument.cs, in turn, contains a RichTextBox control.

Create a new project called Textpad. Add the "Main" and "TextDocument" forms to your project. Set the IsMdiContainer property of Main to true. Add a MainMenu control and an OpenFileDialog control (name it "openFileDialog1") to Main. Set the Filter property of the OpenFileDialog to "Text Files | *.txt", since we will only be working with text files in this project. Add a RichTextBox control to "TextDocument", name it "richTextBox1"; set its Dock property to "Fill" and its Modifiers property to "Internal".

Add a MenuItem control to MainMenu called "File" by clicking on the MainMenu control in Designer mode and typing "File" where the control prompts you to "type here". Set the File item's MergeType property to "MergeItems". Add a second MenuItem called "Window". Under the "File" menu item, add three more items: "New", "Open", and "Exit". Set the MergeOrder property of the "Exit" control to 2. When we start building the "TextDocument" form, these merge properties will allow us to insert menu items from child forms between "Open" and "Exit".

Set the MdiList property of the Window menu item to true. This automatically allows it to keep track of your various child documents during runtime.

Next, we need some operations that will be triggered off by our menu commands. The NewMDIChild() function will create a new instance of the Document object that is also a child of the Main container. OpenFile() uses the OpenFileDialog control to retrieve the path to a text file selected by the user. OpenFile() uses a StreamReader to extract the text of the file (make sure you add a using declaration for System.IO at the top of your form). It then calls an overloaded version of NewMDIChild() that takes the file name and displays it as the current document name, and then injects the text from the source file into the RichTextBox control in the current Document object. The Exit() method closes our Main form. Add handlers for the File menu items (by double clicking on them) and then have each handler call the appropriate operation: NewMDIChild() , OpenFile() , or Exit() . That takes care of your Main form.
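The three operations just described might be sketched as follows. This is a hedged reconstruction, not the article's own listing: the overload shapes, the title-bar handling, and the use of Path.GetFileName are assumptions consistent with the text, which only names NewMDIChild(), OpenFile(), and Exit().

```csharp
// Sketch of the Main form operations described above (Textpad project).
// Assumes the designer-created names from the text: openFileDialog1,
// and an internal richTextBox1 on the TextDocument child form.
private void NewMDIChild()
{
    NewMDIChild("Untitled", string.Empty);
}

private void NewMDIChild(string fileName, string text)
{
    TextDocument doc = new TextDocument();
    doc.MdiParent = this;          // make it a child of the Main container
    doc.Text = fileName;           // displayed as the current document name
    doc.richTextBox1.Text = text;  // inject the source file's text
    doc.Show();
}

private void OpenFile()
{
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        // Extract the text of the selected file with a StreamReader
        using (StreamReader reader = new StreamReader(openFileDialog1.FileName))
        {
            NewMDIChild(Path.GetFileName(openFileDialog1.FileName),
                        reader.ReadToEnd());
        }
    }
}

private void Exit()
{
    this.Close();
}
```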

To the TextDocument form, add a SaveFileDialog control, a MainMenu control, and a ContextMenuStrip control (set the ContextMenuStrip property of richTextBox1 to this new ContextMenuStrip). Set the SaveFileDialog's DefaultExt property to "txt" and its Filter property to "Text File | *.txt". Add "Cut", "Copy", "Paste", and "Delete" items to your ContextMenuStrip. Add a "File" menu item to your MainMenu, and then "Save", "Save As", and "Close" menu items to the "File" menu item. Set the MergeType for "File" to "MergeItems". Set the MergeType properties of "Save", "Save As", and "Close" to "Add", and their MergeOrder properties to 1. This creates a nice effect in which the File menu of the child MDI form merges with the parent File menu.

The following methods will be called by the handlers for each of these menu items: Save(), SaveAs(), CloseDocument(), Cut(), Copy(), Paste(), Delete(), and InsertText(). Please note that the last five methods are scoped as internal, so they can be called by the parent form. This will be particularly important as we move on to the Speechpad project.
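A plausible sketch of those TextDocument methods, assuming the SaveFileDialog was left with its default name saveFileDialog1 and a filePath field tracks the current file (both assumptions; the article does not show this listing). The RichTextBox's built-in clipboard methods do most of the work:

```csharp
// Sketch of the TextDocument methods described above. The last five are
// internal so the parent form (and later, speech commands) can call them.
private string filePath;  // assumed field tracking the current file's path

private void Save()
{
    if (string.IsNullOrEmpty(filePath)) { SaveAs(); return; }
    File.WriteAllText(filePath, richTextBox1.Text);
}

private void SaveAs()
{
    if (saveFileDialog1.ShowDialog() == DialogResult.OK)
    {
        filePath = saveFileDialog1.FileName;
        File.WriteAllText(filePath, richTextBox1.Text);
    }
}

private void CloseDocument() { this.Close(); }

internal void Cut() { richTextBox1.Cut(); }
internal void Copy() { richTextBox1.Copy(); }
internal void Paste() { richTextBox1.Paste(); }
internal void Delete() { richTextBox1.SelectedText = string.Empty; }

// Replaces the current selection (or inserts at the caret when nothing
// is selected) -- handy later for injecting dictated text.
internal void InsertText(string text) { richTextBox1.SelectedText = text; }
```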

Once you hook up your menu item event handlers to the methods listed above, you should have a rather nice text pad application. With our base prepared, we are now in a position to start building some SR features.

Add a reference to the System.Speech assembly to your project. You should be able to find it in C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\. Add using declarations for System.Speech, System.Speech.Recognition, and System.Speech.Synthesis to your Main form. The top of your Main.cs file should now look something like this:
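The code listing was not preserved here; based on the text, a plausible reconstruction is:

```csharp
using System;
using System.IO;
using System.Windows.Forms;
using System.Speech;
using System.Speech.Recognition;
using System.Speech.Synthesis;
```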

In design view, add two new menu items to the main menu in your Main form labeled "Select Voice" and "Speech". For easy reference, name the first item selectVoiceMenuItem. We will use the "Select Voice" menu to programmatically list the synthetic voices that are available for reading Speechpad documents. Three methods do the work. LoadSelectVoiceMenu() loops through all voices installed on the operating system and creates a new menu item for each. VoiceMenuItem_Click() is simply a handler that passes the click event on to the SelectVoice() method. SelectVoice() handles the toggling of the voices we have added to the "Select Voice" menu: whenever a voice is selected, all others are deselected, and if all voices are deselected, we default to the first one.
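A sketch of these three methods is below. It assumes the synthesizer field declared in the next step, and that the menu was built with a MenuStrip, so items are ToolStripMenuItem objects:

```csharp
private void LoadSelectVoiceMenu()
{
    // enumerate every synthetic voice installed on the OS
    foreach (InstalledVoice voice in synthesizer.GetInstalledVoices())
    {
        ToolStripMenuItem item = new ToolStripMenuItem(voice.VoiceInfo.Name);
        item.Click += VoiceMenuItem_Click;
        selectVoiceMenuItem.DropDownItems.Add(item);
    }
    // default to the first voice
    if (selectVoiceMenuItem.DropDownItems.Count > 0)
        SelectVoice((ToolStripMenuItem)selectVoiceMenuItem.DropDownItems[0]);
}

private void VoiceMenuItem_Click(object sender, EventArgs e)
{
    SelectVoice((ToolStripMenuItem)sender);
}

private void SelectVoice(ToolStripMenuItem selectedItem)
{
    // check the chosen voice and uncheck all the others
    foreach (ToolStripMenuItem item in selectVoiceMenuItem.DropDownItems)
        item.Checked = (item == selectedItem);
    selectedVoice = selectedItem.Text;
}
```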

Now that we have gotten this far, I should mention that all this trouble is a little silly if there is only one synthetic voice available, as there is when you first install Vista. Her name is Microsoft Anna, by the way. If you have Vista Ultimate or Vista Enterprise, you can use the Vista Updater to download an additional voice, named Microsoft Lili, which is contained in the Simplified Chinese MUI. She has a bit of an accent, but I am coming to find it rather charming. If you don't have one of the high-end flavors of Vista, however, you might consider leaving the voice selection code out of your project.

We have not yet declared the selectedVoice class-level variable (your IntelliSense may have complained about it), so the next step is to do just that. While we are at it, we will also declare a private instance of the System.Speech.Synthesis.SpeechSynthesizer class and initialize it, along with a call to the LoadSelectVoiceMenu() method from above, in your constructor:
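The declarations and constructor might look like this (the constructor name Main is assumed from the form's name):

```csharp
private string selectedVoice = string.Empty;
private SpeechSynthesizer synthesizer;

public Main()
{
    InitializeComponent();
    synthesizer = new SpeechSynthesizer();
    LoadSelectVoiceMenu();
}
```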

To allow the user to utilize the speech synthesizer, we will add two new menu items under the "Speech" menu labeled "Read Selected Text" and "Read Document". In truth, there isn't much to using the Vista speech synthesizer: we simply pass a text string to our local SpeechSynthesizer object and let the operating system do the rest. Hook up event handlers for the click events of these two menu items to the following methods and you will be up and running with a speech-enabled application:
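A sketch of those methods follows. The SelectedText and DocumentText members on TextDocument are hypothetical internal properties wrapping richTextBox1.SelectedText and richTextBox1.Text:

```csharp
private void ReadSelectedTextMenuItem_Click(object sender, EventArgs e)
{
    TextDocument doc = this.ActiveMdiChild as TextDocument;
    if (doc != null) ReadAloud(doc.SelectedText);   // assumed internal property
}

private void ReadDocumentMenuItem_Click(object sender, EventArgs e)
{
    TextDocument doc = this.ActiveMdiChild as TextDocument;
    if (doc != null) ReadAloud(doc.DocumentText);   // assumed internal property
}

private void ReadAloud(string speakText)
{
    // pass the text to the OS synthesizer using the currently selected voice
    synthesizer.SelectVoice(selectedVoice);
    synthesizer.Speak(speakText);
}
```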

Playing with the speech synthesizer is a lot of fun for about five minutes (ten if you have both Microsoft Anna and Microsoft Lili to work with) -- but after typing "Hello World" into your Speechpad document for the umpteenth time, you may want to do something a bit more challenging. If you do, then it is time to plug in your expensive microphone, since speech recognition really works best with a good, expensive microphone. If you don't have one, however, go ahead and plug in a cheap microphone. My cheap microphone seems to work fine. If you don't have a cheap microphone either, I have heard that you can take a speaker and plug it into the mic jack of your computer, and if that doesn't cause an explosion, you can try talking into it.

While speech synthesis may be useful for certain specialized applications, voice commands, by contrast, are a feature that can enrich any current WinForms application. With the SR managed API, they are also easy to implement once you understand a few concepts, such as the Grammar class and the SpeechRecognitionEngine.

We will begin by declaring a local instance of the speech engine and initializing it.
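The initialization might look like the following sketch, which wires up the input device, the command grammar, and the two events discussed below (handler names are assumptions):

```csharp
private SpeechRecognitionEngine recognizer;

private void InitializeSpeechRecognitionEngine()
{
    recognizer = new SpeechRecognitionEngine();

    // listen on the default input device (whatever microphone is plugged in);
    // SetInputToWaveFile() could be used instead to read from a wave file
    recognizer.SetInputToDefaultAudioDevice();

    // a very simple grammar: recognize any one of four command words
    Choices commands = new Choices("cut", "copy", "paste", "delete");
    GrammarBuilder grammarBuilder = new GrammarBuilder(commands);
    Grammar customGrammar = new Grammar(grammarBuilder);
    recognizer.UnloadAllGrammars();
    recognizer.LoadGrammar(customGrammar);

    // capture the engine's hypotheses and its final recognitions
    recognizer.SpeechHypothesized += Recognizer_SpeechHypothesized;
    recognizer.SpeechRecognized += Recognizer_SpeechRecognized;
}
```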

The speech recognition engine is the main workhorse of the speech recognition functionality. At one end, we configure the input device that the engine will listen on. In this case, we use the default device (whatever you have plugged in), though we can also select other inputs, such as specific wave files. At the other end, we capture two events thrown by our speech recognition engine. As the engine attempts to interpret the incoming sound stream, it will throw various "hypotheses" about what it thinks is the correct rendering of the speech input. When it finally determines the correct value, and matches it to a value in the associated grammar objects, it throws a speech recognized event, rather than a speech hypothesized event. If the determined word or phrase does not have a match in any associated grammar, a speech recognition rejected event (which we do not use in the present project) will be thrown instead.

In between, we set up rules to determine which words and phrases will throw a speech recognized event by configuring a Grammar object and associating it with our instance of the speech recognition engine. In the sample code above, we configure a very simple rule which states that a speech recognized event will be thrown if any of the following words is uttered: "cut", "copy", "paste", or "delete". Note that we use a GrammarBuilder class to construct our custom grammar, and that the syntax of the GrammarBuilder class closely resembles the syntax of the StringBuilder class.

This is the basic code for enabling voice commands for a WinForms application. We will now enhance the Speechpad application by adding a menu item to turn speech recognition on and off, a status bar so we can watch as the speech recognition engine interprets our words, and a function that will determine what action to take if one of our key words is captured by the engine.

Add a new menu item labeled "Speech Recognition" under the "Speech" menu item, below "Read Selected Text" and "Read Document". For convenience, name it speechRecognitionMenuItem. Add a handler to the new menu item, and use the following code to turn speech recognition on and off, as well as toggle the speech recognition menu item. Besides the RecognizeAsync() method that we use here, it is also possible to start the engine synchronously or, by passing it a RecognizeMode.Single parameter, cause the engine to stop after the first phrase it recognizes. The method we use to stop the engine, RecognizeAsyncStop(), is basically a polite way to stop the engine, since it will wait for the engine to finish any phrase it is currently processing before quitting. An impolite method, RecognizeAsyncCancel(), is also available -- to be used in emergency situations, perhaps.
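One way to write the toggle (method names here are assumptions consistent with the rest of the article):

```csharp
private void SpeechRecognitionMenuItem_Click(object sender, EventArgs e)
{
    if (speechRecognitionMenuItem.Checked)
        TurnSpeechRecognitionOff();
    else
        TurnSpeechRecognitionOn();
}

private void TurnSpeechRecognitionOn()
{
    // RecognizeMode.Multiple keeps the engine listening until we stop it;
    // RecognizeMode.Single would stop after the first recognized phrase
    recognizer.RecognizeAsync(RecognizeMode.Multiple);
    speechRecognitionMenuItem.Checked = true;
}

private void TurnSpeechRecognitionOff()
{
    // the polite stop: finishes any phrase currently being processed
    recognizer.RecognizeAsyncStop();
    speechRecognitionMenuItem.Checked = false;
}
```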

We are actually going to use the RecognizeAsyncCancel() method now, since there is an emergency situation. The speech synthesizer, it turns out, cannot operate if the speech recognizer is still running. To get around this, we will need to disable the speech recognizer at the last possible moment, and then reactivate it once the synthesizer has completed its tasks. We will modify the ReadAloud() method to handle this.
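The modified ReadAloud() might look like this sketch, which cancels recognition at the last possible moment and restores it once speaking is done:

```csharp
private void ReadAloud(string speakText)
{
    // the synthesizer cannot operate while the recognizer is running,
    // so cancel recognition immediately rather than waiting politely
    bool wasListening = speechRecognitionMenuItem.Checked;
    if (wasListening) recognizer.RecognizeAsyncCancel();

    synthesizer.SelectVoice(selectedVoice);
    synthesizer.Speak(speakText);

    // reactivate the recognizer once the synthesizer has finished
    if (wasListening) recognizer.RecognizeAsync(RecognizeMode.Multiple);
}
```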

The user now has the ability to turn speech recognition on and off. We can make the application more interesting by capturing the speech hypothesize event and displaying the results to a status bar on the Main form. Add a StatusStrip control to the Main form, and a ToolStripStatusLabel to the StatusStrip with its Spring property set to true . For convenience, call this label toolStripStatusLabel1 . Use the following code to handle the speech hypothesized event and display the results:
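A minimal handler, assuming a using declaration for System.Drawing for the Color type:

```csharp
private void Recognizer_SpeechHypothesized(object sender,
    SpeechHypothesizedEventArgs e)
{
    // show the engine's current guess in the status bar
    toolStripStatusLabel1.ForeColor = Color.DarkGray;
    toolStripStatusLabel1.Text = e.Result.Text;
}
```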

Now that we can turn speech recognition on and off, as well as capture misinterpretations of the input stream, it is time to capture the speech recognized event and do something with it. The SpeechToAction() method will evaluate the recognized text and then call the appropriate method in the child form (these methods are accessible because we scoped them internal in the Textpad code above). In addition, we display the recognized text in the status bar, just as we did with hypothesized text, but in a different color in order to distinguish the two events.
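A sketch of the recognized handler and SpeechToAction() (the color choices are illustrative, not from the original listing):

```csharp
private void Recognizer_SpeechRecognized(object sender,
    SpeechRecognizedEventArgs e)
{
    SpeechToAction(e.Result.Text);
}

private void SpeechToAction(string recognizedText)
{
    TextDocument doc = this.ActiveMdiChild as TextDocument;
    if (doc == null) return;

    // display recognized text in a different color than hypothesized text
    toolStripStatusLabel1.ForeColor = Color.Black;
    toolStripStatusLabel1.Text = recognizedText;

    // dispatch to the internal methods on the child form
    switch (recognizedText)
    {
        case "cut":    doc.Cut();    break;
        case "copy":   doc.Copy();   break;
        case "paste":  doc.Paste();  break;
        case "delete": doc.Delete(); break;
    }
}
```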

Now let's take Speechpad for a spin. Fire up the application and, if it compiles, create a new document. Type "Hello world." So far, so good. Turn on speech recognition by selecting the Speech Recognition item under the Speech menu. Highlight "Hello" and say the following phrase into your expensive microphone, inexpensive microphone, or speaker: delete. Now type "Save the cheerleader, save the". Not bad at all.

Voice command technology, as exemplified above, is probably the most useful and easiest to implement aspect of the speech recognition functionality provided by Vista. In a few days of work, any current application can be enabled to use it, and the potential for streamlining workflow and making it more efficient is truly breathtaking. The cool factor, of course, is also very high.

Having grown up watching Star Trek reruns, however, I can't help but feel that the dictation functionality is much more interesting. Computers are meant to be talked to and told what to do, not cajoled into doing tricks for us based on finger motions over a typewriter. My long-term goal is to be able to code by talking into my IDE in order to build UML diagrams and then, at a word, turn that into an application. What a brave new world that will be. Toward that end, the SR managed API provides the DictationGrammar class.

Whereas the Grammar class works as a gatekeeper, restricting the phrases that get through to the speech recognized handler down to a select set of rules, the DictationGrammar class, by default, kicks out the jams and lets all phrases through to the recognized handler.

In order to make Speechpad a dictation application, we will add the default DictationGrammar object to the list of grammars used by our speech recognition engine. We will also add a toggle menu item to turn dictation on and off. Finally, we will alter the SpeechToAction() method to insert any phrases that are not voice commands into the current Speechpad document as text.

Begin by creating a local instance of DictateGrammar for our Main form, and then instantiate it in the Main constructor. Your code should look like this:
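The declaration and updated constructor might look like this sketch:

```csharp
private DictationGrammar dictationGrammar;

public Main()
{
    InitializeComponent();
    synthesizer = new SpeechSynthesizer();
    dictationGrammar = new DictationGrammar();  // the default, wide-open grammar
    LoadSelectVoiceMenu();
    InitializeSpeechRecognitionEngine();
}
```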

Create a new menu item under the Speech menu and label it "Take Dictation". Name it takeDictationMenuItem for convenience. Add a handler for the click event of the new menu item, and stub out TurnDictationOn() and TurnDictationOff() methods. TurnDictationOn() works by loading the local dictationGrammar object into the speech recognition engine. It also needs to turn speech recognition on if it is currently off, since dictation will not work if the speech recognition engine is disabled. TurnDictationOff() simply removes the local dictationGrammar object from the speech recognition engine's list of grammars.
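A possible implementation, following the behavior just described:

```csharp
private void TakeDictationMenuItem_Click(object sender, EventArgs e)
{
    if (takeDictationMenuItem.Checked)
        TurnDictationOff();
    else
        TurnDictationOn();
}

private void TurnDictationOn()
{
    // dictation only works while the recognition engine is running
    if (!speechRecognitionMenuItem.Checked) TurnSpeechRecognitionOn();
    recognizer.LoadGrammar(dictationGrammar);
    takeDictationMenuItem.Checked = true;
}

private void TurnDictationOff()
{
    recognizer.UnloadGrammar(dictationGrammar);
    takeDictationMenuItem.Checked = false;
}
```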

For an extra touch of elegance, alter the TurnSpeechRecognitionOff() method by adding a line of code to turn off dictation when speech recognition is disabled:
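That is, something along these lines:

```csharp
private void TurnSpeechRecognitionOff()
{
    recognizer.RecognizeAsyncStop();
    speechRecognitionMenuItem.Checked = false;
    TurnDictationOff();  // dictation cannot run without the recognition engine
}
```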

Finally, we need to update the SpeechToAction() method so it will insert any text that is not a voice command into the current Speechpad document. Use the default statement of the switch control block to call the InsertText() method of the current document.
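The updated switch block might look like this:

```csharp
switch (recognizedText)
{
    case "cut":    doc.Cut();    break;
    case "copy":   doc.Copy();   break;
    case "paste":  doc.Paste();  break;
    case "delete": doc.Delete(); break;
    default:
        // anything that is not a voice command is dictation: insert it
        doc.InsertText(recognizedText);
        break;
}
```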

With that, we complete the speech recognition functionality for Speechpad. Now try it out. Open a new Speechpad document and type "Hello World." Turn on speech recognition. Select "Hello" and say delete. Turn on dictation. Say brave new.

This tutorial has demonstrated the essential code required to use speech synthesis, voice commands, and dictation in your .NET 2.0 Vista applications. It can serve as the basis for building speech recognition tools that take advantage of default as well as custom grammar rules to build advanced application interfaces. Besides the strange compatibility issues between Vista and Visual Studio, at the moment the greatest hurdle to using the Vista managed speech recognition API is the remarkable dearth of documentation and samples. This tutorial is intended to help alleviate that problem by providing a hands-on introduction to these new tools.

  • Feb 28, 2007: updated to fix a conflict between the speech recognizer and text-to-speech synthesizer.



Comments and Discussions

11-Nov-14 21:33
is this code can be used for my megaproject on voicepad using C#..plz replyy fast

14-Mar-12 18:30
Good article friend.
14-Aug-11 11:56
I am having trouble getting a SpeechRecognitionEngine object to stop listening. The intent is to allow the user to enable and disable any speech recognition through menu options rather than muting their microphone. I basically want to terminate listening all together.

I have a SpeechRecognitionEngine object I call , running in its own thread, in a winforms STA application.

I tell it to begin listening with:


However, for any of the following calls:


I receive an exception "[...] of type 'System.InvalidCastException' occurred in System.Speech.dll"

Additional information: Unable to cast COM object of type 'System.Speech.Internal.SapiInterop.SpInprocRecognizer' to interface type 'System.Speech.Internal.SapiInterop.ISpRecognizer'. This operation failed because the QueryInterface call on the COM component for the interface with IID '{C2B5F241-DAA0-4507-9E16-5A1EAA2B7A5C}' failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).

Can you suggest possible reasons this may be happening? My prime suspect is that my thread housing the SpeechRecognitionEngine object is running in STA rather than MTA, but I want to avoid running my entire WinForms application in MTA because I don't want to lose features such as accelerator keys and the clipboard.

5-Mar-11 20:33
hi.
can anybody tell me how to open computer password using my voice.
thanks in advance.
8-Feb-11 3:16
the speech pad demo shows error message..it doesnt work properly..can anyone reply
24-May-10 1:07
Dear Sir ,

Whenever I wish to run the code , I debug the code & its shows error in the dialog box which is below

System.InvalidOperationException was unhandled
Message="The language for the grammar does not match the language of the speech recognizer."
Source="System.Speech"

at line " recognizer.LoadGrammar(customGrammar);" inside "private void InitializeSpeechRecognitionEngine()"

The machine is installed with Windows Vista Home Premium, SAPI 5.1 & Visual Studio 2005 with Beta Upgrade.

The Program of Speech Synthesis runs , however I am getting trouble in Speech-Recognition. Whenever I have tried with Microsoft Anna / Mike(all 4 options) it shows Fatal error.
21-Feb-10 17:50
The text to speech is simple and works good as u use sapi ... but the speech recog is completely pathetic highly misspelled words and gives total absurd results...
Is there any better work on speech recognition? Reading a speech stream to text? i dont think any api's wud support that ... can i get more info
24-May-10 1:10
could yu run the code of speech-recognition? I can not . I have installed sapi 5.1 & have VS 2005 wd beta upgradation,

still after debugging the code it shows error at " recognizer.LoadGrammar(customGrammar);" saying

System.InvalidOperationException was unhandled
Message="The language for the grammar does not match the language of the speech recognizer."
Source="System.Speech"

did yu do any modification?
19-May-09 8:54
Hello,
is it allowed to use the Vista Speech Recognition in a commercial project? I can't find anything about this on the internet!

Thanks
19-May-09 11:47
KR,

It's part of the OS, so you can probably think of it in terms of "integration". Microsoft allows you to use windows and popups, right? The speech recognition engine in Vista, to my thinking, is the same sort of thing. As long as you aren't repackaging elements of the OS into your own application, but instead are just hooking into it, I don't see why it is a problem.

Of course, you do have to make sure that your clients are running Vista under your scenario.

But I'm a layman and not a lawyer, and this is just a layman's opinion.

James
17-May-09 10:08
Is there any nice way to convert a text string into an SSML phonetic form so that it may be edited and tweaked? Better yet, is there any way to convert a text string into a more readable phonetic form which could in turn be converted to SSML? Speaking text and capturing the phonemes sort of works, but leaves out stresses and word breaks.
19-May-09 11:43
Supercat,

This sounds like a brilliant project. I'm afraid I'm not equipped to answer your question, however. It requires a lower level of knowledge of the speech recognition technology than I currently possess.

James
5-May-09 21:53
Hello,

What if a spoken word or sentence is rejected? Is it possible to tell the engine that the rejected sound is equal to a specific text? Or will I have to add the text to the grammar and hope that it figures out it corresponds to the correct sound?

Is it possible to train the engine if I have a lot of spoken wave files of dictation and corresponding texts which have been transcribed from the wave files? I know how to use a wave file as an input, but I still need to teach it the parts it can't translate from audio into text.

I hope you can help me on this matter as i cant seem to find anything elsewhere on the net.

Thanks in advance!

Best regards,
Simon.
19-May-09 11:41
guden,

You've hit on one of the key issues in speech recognition technology. We can either train a speech recognition engine to favor certain words (for instance, words common to dictation) or we can train an engine to recognize the speech patterns of a given speaker.

A lack of a solution on the Internet may be due to the fact that we haven't yet hit on a good solution. In an ideal world, a speech engine would be able to translate any speech into text -- but this isn't currently the world we live in.

If your list of custom words (the favored words) is small enough, then you might be able to solve this for your domain by having a custom dictionary (see my Sophiabot article for a walkthrough of using this). As the dictionary gets bigger, however, the accuracy diminishes and you'll have to fall back on the speech engine being able to recognize certain speakers.

It's sort of a hit or miss thing, and you'll have to try different speech engines to see if any are smart enough to deal with your specific dictionary accurately.

On the other hand, it may just come down to waiting for the speech recognition technology to catch up to the uses we want to put it to.

James
11-Mar-09 1:40
Hi there James,

this tutorial looks very,very nice and I'd really like to try it out.
Is there any chance that, you or perhaps someone else, could make an "XP version" of it ?

Best Wishes
/Thomas
19-May-09 11:34
I'm actually thinking of making a Windows 7 version of it. I haven't looked at XP's speech engine, which wasn't originally built into the OS (but perhaps it is now part of a service pack?).

I would welcome anyone else who wants to give a shot at an XP/Vista/Windows 7 comparison, though.

James
5-Mar-09 1:38
Thanks for the nice article, it helped me much..

I would like to use speech technology in a multilangugae application, what should i do to read the text in German or French?? also to understand German or French speech, what should i do?

Thanks for your concern
25-Feb-09 7:55
Very interesting article !

I was wondering whether it could be possible to correct misrecognized words with the SpeechRecognitionEngine to improve it (cf. the 'Correct...' option proposed by the Speech UI) ?

Thanks a lot for your answer !
15-Jan-09 15:37
I am new in programming field and just know Visual Basic right now. I would like to get coding of this article in Visual Basic if anyone tried it. James can you please send me a simple program in VB that just recognize words and display on text box.I just want to study how it works.
19-Jan-09 3:31
aamircode,

Thank you for your kind words. I'm currently on a project for a client and don't have much time, but you can try using Kamal Patel's conversion tool here: [ ], to get a vb equivalent of this code. I think you might get a lot out of simply doing the conversion -- I always find that looking at vb and c# versions of the same code helps me to understand it better.

If this doesn't work, just contact me again in a week or so, when my project is done, and I'll see what I can do to help you out.

Yours,
James
12-Jun-08 3:13
Just what I was looking for. I had my program working in XP with Interop.Speech.dll and it broke in Vista.
19-Feb-08 0:04
I tried loading your example in VS2008 with .NET Framework 3.5 and got the following error executing the line :

The language for the grammar does not match the language of the speech recognizer.

My locale was originally set to Australia but I changed this to United States however this didn't make any difference. Any ideas on how to overcome this problem would be appreciated.
19-Feb-08 3:06
Hi Ray,

I'll take a look at it on my home (Vista) computer later. In the meantime, you might try targetting the 2.0 framework to see if it makes a difference (not sure why it would, but its worth a try).

James
1-Sep-07 10:13
Hi,

This is a very useful article, it inspired me into writing a voice internet browsing software that I put up for free (younicate). Also, it's best to read the SAPI Sofia Bot article too.. If there's enough demand, I might write an article describing how to recreate a voice browsing software...

Ciao
21-Aug-07 2:52
Hi there, i tried to list all voices installed based on your code, but it only lists 2 of them but i have 7 installed, the 2 of them are the Microsoft Anna and Lili but Sam and some others are not listed, does the Engine only support the default ones shipped by Microsoft and is not accepting any third party voices ???

thanks for your answer ...


How to use Vista's Speech Recognition Abilities

  • Berry van der Linden


The Speech Recognition feature

In order to use Vista's built-in Windows Speech Recognition feature, you will need a good-quality headset microphone (see the first picture below), because headsets do a much better job of concentrating on your voice and ignoring background noises. A few microphones are also very good at filtering out background noise.

To open Windows Speech Recognition, open the Start Menu and select All Programs > Accessories > Ease of Access > Windows Speech Recognition. Alternatively, you can click the Start Menu, type speech, wait for the computer to bring up the search results, and then click Speech Recognition Options from the list that appears. The wizard that appears will take you through configuring your computer to get used to your voice. It starts with the basic setup of your computer, like checking volume and printing common voice commands, which leads into the speech recognition tutorial. Take the tutorial; it shows you how this feature works, the things you should expect to see when the program is working, and all the basic commands. This can be a time-consuming process, but it will help the speech recognition software get used to the nuances of your voice.

Once you have everything up and running, you are set to start using the voice recognition software. Expect there to be growing pains during this time; this is new software and there will be times when you will be frustrated, but the time Microsoft's customers spend with this feature, and the feedback they give Microsoft, will allow Microsoft to improve future releases.

This tool can be useful, and you can continue to customize it so it works best for you. To do this, open the Ease of Access Center via the Control Panel and choose Speech Recognition Options. Here you can take (or retake) the tutorial, and you can print out and view the common voice commands in the Speech Reference Card. The link "Train Your Computer To Better Understand You" shows you text one line at a time. As you read it back to the PC (as normally as possible), the Speech Recognition engine will learn to recognize your voice.

You can also configure the Speech Recognition tool to work with multiple users. If you share your computer with other users, everyone can set up their own profile. You may even want to set up multiple profiles for yourself to use your computer in varying environments. You will find profiles in the Advanced Speech Options dialog box (see picture below). Open the Ease of Access Center via the Control Panel, choose Speech Recognition Options, and then look for the Advanced Options link in the far left panel of the window.

Speech Recognition Tutorial Images

Basics Screen

Computer Hope

How to use the Windows Speech Recognition feature


Microsoft Windows Vista, 7, 8, 10, and 11 include a speech recognition feature in English, French, German, Japanese, Simplified Chinese, Spanish, and Traditional Chinese. Below are the steps to set up, start, and use Windows Speech Recognition, and common commands and features.

  • How to start Windows Speech Recognition.
  • Using Windows Speech Recognition.
  • Windows Speech Recognition commands.
  • Using the mouse with speech recognition.

How to start Windows Speech Recognition

HyperX Cloud headset with microphone

First, make sure your computer has either a built-in microphone, or an external microphone or headset connected to it. We recommend using a headset with a microphone since it tends to give the best results.

  • How to connect a microphone to a computer.
  • How to connect a headset to a computer.

After connecting one of these devices, open and configure the Windows Speech Recognition feature by following the steps below.

  • Press the Windows key, type Control Panel, and press Enter. Or, open the Start menu and click Windows System > Control Panel.
  • In the Control Panel, click Ease of Access in the lower-right corner of the window.

Ease of Access link

  • On the next screen, click Speech Recognition. In Windows 11, click the Start speech recognition option instead.

Speech Recognition link

  • Select the type of microphone (input device) connected to your computer, then click Next.

If you are using a built-in microphone, select the Other input device type.

Speech Recognition input device choose.

  • On the screen asking you to choose an activation mode, select manual or voice, depending on your preference. With manual activation mode, you must start the Speech Recognition utility each time you want to use it. With voice activation mode, you can say "start listening" to begin speech recognition.

Speech Recognition startup options.

  • Upon success, you see Windows Speech Recognition controls at the top of the screen (shown in the next section).

Using Windows Speech Recognition

Once the Windows Speech Recognition is running, a microphone bar appears in the top center of your screen, as shown in the picture. By default, Windows Speech Recognition should be sleeping. To start Windows Speech Recognition, say "Start listening."

Windows Speech Recognition commands

Windows Speech Recognition can type any text and perform several commands. Below are common commands to help you start using this feature immediately. To dictate text while the program is listening, simply say what you want typed. When the text is correct, say "Insert" to confirm and place it in the document.

The absolute basics

  • Start listening - Make Windows Speech Recognition listen to your commands.
  • Stop listening - Put the program to sleep and ignore any talking.
  • Undo - Undo anything done with the voice.
  • Cancel - Cancel last command.
  • Caps <word> - Capitalize the first letter of the word.
  • All caps <word> - Capitalize all letters of the word.
  • No caps <word> - Lowercase all letters of the word.
  • New line - Start a new line of text.
  • New paragraph - Start a new paragraph.
  • Go to <word> - Move to the beginning of the word that was typed.
  • What can I say? - Open a help screen with commands and additional help.

Handling programs

  • Open <program> - Open a program. For example, say "open Excel" to open Microsoft Excel .
  • Switch to <program> - Switch to an open program. For example, say "switch to chrome" to switch to the Google Chrome browser.
  • Close <program> - Close an open program. For example, say "close notepad" to close the Notepad program.
  • Minimize - Minimize current window.

Using the mouse with speech recognition

Windows Speech Recognition can also be used to perform actions that a mouse normally does. Below are all the commands to control, click, and move your mouse using your voice.

Using the mousegrid

To click any portion of the screen, use the mousegrid feature. Saying "mousegrid" opens a grid overlay similar to the example shown below. Saying the number of the area you want to click zooms in and displays another grid. Continue to say the number of the area you want to click until it's at the correct location, and say "click."

Mousegrid feature for selecting a section of the screen

Mouse related commands

  • Show numbers - Show numbers of what can be clicked on the current window. Once numbers are shown, say the number you want to click and say "ok."
  • Click - Click the default button once.
  • Double-click - Double-click the default mouse button.
  • Scroll <direction> - Scroll the screen up, down, left, or right.
  • Scroll <direction> x pages - Scroll up, down, left, or right x number of pages. For example, say "scroll down 3 pages."
  • Select <word> - Select a word that was entered using your voice.
  • Correct <word> - Correct a selected word to something else.
  • Delete <word> - Delete a word.
  • Select all - Select all text.
  • Cut - Cut selected text.
  • Copy - Copy selected text.
  • Paste - Paste text in the clipboard .

Related information

  • What programs can I use for speech recognition?
  • See our voice recognition definition for further information and related links.
  • Microsoft Windows help and support.

Add Features to Windows Speech Recognition

WSRToolkit v.3: create speech macro shortcuts for your PC without programming, increase voice-to-text accuracy, and create a personalized user system that perfectly matches your needs.

Software for WSR (Windows Speech Recognition)

We specialize in extension software for Windows™ Speech Recognition (WSR), the free speech recognition system built into Microsoft's™ most recent operating systems (thoroughly tested in Vista through to the new Windows™ 11). Our WSRToolkit v.3 includes easy-to-create macro shortcuts without programming and accuracy features, among others. WSR itself allows a user to control applications and the Windows™ desktop user interface through voice commands. Users can dictate text within documents, email, and forms; control the operating system user interface; perform keyboard shortcuts; and move the mouse cursor. The majority of integrated applications in Windows™ can be controlled using WSR. Our WSRToolkit v.3 works in conjunction with WSR. The default installation of WSR lacks many features taken for granted by users of Dragon NaturallySpeaking™; WSRToolkit v.3 adds these features and more at a price point far below its competitors. WSRToolkit v.3 provides the usability you expect your speech recognition software to have, including training the system to understand your specific speech patterns and enunciation. The building and use of macros allows detailed actions in applications to be run on demand. One customer wrote: "Using WSRMacros: The User's Guide and WSRToolkit, I wrote a series of commands for QuickBooks that imported customer and invoice information from our order system. Entering 20-30 orders by hand used to take 1-2 hours a day and now takes less than 2 minutes from start to finish."

Backup works perfectly

I just wanted to say THANK YOU!!!  The backup thing works PERFECTLY!  Thank you so much!  I love this program, and just purchased the full version and the manual.  Thanks again you are AWESOME!

Saved me hundreds of hours

Thanks so much for a great product.  It has saved me hundreds of hours.

Incredibly Accurate!

This is incredibly accurate and useful.  Thanks so much!

Thank you for the help

Congratulations, you have solved another one of my curious problems.  Yes, whatever it was that you did corrected the issue.  I am up and running again, thanks to you.  🙂

Enjoy Transcription feature

Keep up the great work. My deaf wife enjoys the text transcripts from audio produced through the Transcription feature.

Personal followup

The personal followup is very much appreciated.

Customer Service

I just wanted to thank you for your customer service.

Great customer service

You're the best for checking all of this with me!  What great customer service!!!

I enjoy using it

The WSRToolKit v.3 is a very valuable enhancement to Windows Speech Recognition.  I'm enjoying using it.  Thank you so much once again for your help.

I've dictated this e-mail with the voice recognition software and the Revolabs microphone! Once again, many thanks for the great customer service.

I will recommend you to others

The service that you have provided has not gone unnoticed and I will certainly be recommending you to others.

Needed to go “Hands Free” quickly

I ruptured a disc in my spine so I had to go hands-free in a hurry. Your recommendation did the trick first-try when I couldn't really afford to fiddle with optimal hardware.

This stuff is absolutely phenomenal. One of the best investments, I have ever made. Thank you, and keep up the good work.

Customer for Life!

This kind of prompt, knowledgeable, customer service is so rare these days. Thank you very much. I'm a customer for life, and I'll be sure to recommend you here at Microsoft.

Never used Windows™ Speech Recognition? It’s easy to set up!

  • First, purchase the best quality headset/microphone within your budget (high-quality input gives the best results).
  • Second, open your Windows Control Panel and open "Speech Recognition" (visit the Wikipedia article for background).
  • Third, click on the first link to “Start Speech Recognition.” The first screen for setting up speech recognition explains what the feature does and how it works.
  • Finally, click Next and you will enter a setup wizard to optimize your microphone.

Once you have WSR set up and tested, you will notice it offers no fine-tuning for accuracy, no shortcut creation, no transcription of recorded audio, and no option to build custom macros. This is where WSRToolkit v.3 and WSRMacros: The User’s Guide really show their value!

The WSRToolkit v.3  user interface:

WSRToolkit V.3 for Windows Speech Recognition Interface

Free Information Downloads

Getting started.

Download/Read about getting started with Windows Speech Recognition.

WSR Reference Card

Download/Read the Windows Speech Recognition Reference Card - good for all versions of Windows.

WSR Commands

Download/Read a Comprehensive List of WSR Commands - Word Document with Approximately 400 Commands.

How Not to Read into WSR

Listen/Download MP3 "How Not To Read into WSR".

How To Read into WSR

Listen/Download MP3 "How to Read into WSR properly".

Sample Passage Reading

Listen/Download MP3 "How to read a passage into WSR".

Our Speech Recognition Products

WSRToolkit and Windows Certified Logo

WSRToolkit v.3 for Windows Speech Recognition

WSRMacros

WSRMacros: The User’s Guide for Windows

WSRToolkit v.3 for Windows Speech Recognition is perfectly adapted for special-needs users.


Use voice recognition in Windows

On Windows 11 22H2 and later, Windows Speech Recognition (WSR) will be replaced by voice access starting in September 2024. Older versions of Windows will continue to have WSR available. To learn more about voice access, go to Use voice access to control your PC & author text with your voice .

Set up a microphone

Before you set up speech recognition, make sure you have a microphone set up.

Select Start > Settings > Time & language > Speech.

The speech settings menu in Windows 11

The Speech wizard window opens, and the setup starts automatically. If the wizard detects issues with your microphone, they will be listed in the wizard dialog box. You can select options in the dialog box to specify an issue and help the wizard solve it.

Help your PC recognize your voice

You can teach Windows 11 to recognize your voice. Here's how to set it up:

Press Windows logo key + Ctrl + S. The Set up Speech Recognition wizard window opens with an introduction on the Welcome to Speech Recognition page.

Tip: If you've already set up speech recognition, pressing Windows logo key + Ctrl + S opens speech recognition and you're ready to use it. If you want to retrain your computer to recognize your voice, press the Windows logo key, type Control Panel, and select Control Panel in the list of results. In Control Panel, select Ease of Access > Speech Recognition > Train your computer to better understand you.

Select Next. Follow the instructions on your screen to set up speech recognition. The wizard will guide you through the setup steps.

After the setup is complete, you can choose to take a tutorial to learn more about speech recognition. To take the tutorial, select Start Tutorial in the wizard window. To skip it, select Skip Tutorial. You can now start using speech recognition.

Windows Speech Recognition commands

Before you set up voice recognition, make sure you have a microphone set up.

Select the Start button, then select Settings > Time & Language > Speech.


You can teach Windows 10 to recognize your voice. Here's how to set it up:

In the search box on the taskbar, type Windows Speech Recognition , and then select Windows Speech Recognition  in the list of results.

If you don't see a dialog box that says "Welcome to Speech Recognition Voice Training," then in the search box on the taskbar, type Control Panel , and select Control Panel in the list of results. Then select Ease of Access > Speech Recognition > Train your computer to understand you better .

Follow the instructions to set up speech recognition.


Quick Tips: Windows Vista Speech recognition



WinBuzzer

Windows 11 Voice Access to Replace Outdated Vista-Era Speech Recognition Technology

Windows Speech Recognition is being retired, replaced by the more advanced and offline-capable Voice Access in Windows 11.

Luke Jones

Microsoft has announced the deprecation of Windows Speech Recognition , indicating a strategic shift towards the more modern and sophisticated Voice Access technology in Windows 11 versions 22H2 and 23H2. The company has affirmed that the traditional speech recognition service will no longer receive updates, cementing Voice Access as the primary mode of voice-operated control and text authorship on the platform.

Voice Access Takes the Reins

Voice Access is designed to empower all users, especially those with mobility disabilities, to navigate and control their computers using voice commands. The new feature enables tasks such as opening applications, web browsing, and composing emails solely through vocal instructions. Voice Access operates on modern on-device speech recognition technology, enhancing accuracy with the compelling advantage of functioning offline, thus ensuring reliable performance without necessitating an internet connection.

A Look Back at Speech Recognition

Speech Recognition was initially introduced as a stand-alone feature in 2006 with the launch of Windows Vista , with the intention of enhancing the operating system's accessibility. However, it faced unexpected challenges, including exploitation by malicious actors. Notably, Microsoft reported significant achievements in speech recognition technology in 2016, reaching a milestone in September by attaining a record-low word error rate, and in the following month, by matching human parity in recognition accuracy. Despite these advancements, the evolution of technology and user needs has spurred the transition to the more versatile Voice Access.
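The word error rate cited in those milestones is the standard accuracy metric for speech recognizers: the word-level edit distance (substitutions, insertions, and deletions) between the recognizer's output and a reference transcript, divided by the number of reference words. A minimal illustration in Python (the sample phrases are invented):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed with a word-level Levenshtein edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # match or substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One insertion ("the") plus one deletion ("now") against four reference words:
print(word_error_rate("start speech recognition now",
                      "start the speech recognition"))  # prints 0.5
```

"Matching human parity" in this metric means the recognizer's error rate on a benchmark is no higher than that of human transcribers on the same audio.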

Reinforcing the Future of Accessibility

As part of its commitment to inclusive technology, Microsoft has enriched its accessibility offerings, integrating features such as Live Captions, enhanced Narrator Voices, and Narrator Extensions, alongside Voice Access. Together, these innovations reflect the company's ongoing dedication to facilitating a user-friendly and barrier-free computing experience for a diverse user base.

The full list of features Microsoft has deprecated in favor of more advanced technologies, as well as further details on the accessibility enhancements added to Windows 11, is available for review on the company's official website .

  • Desktop Operating Systems
  • Microsoft Windows
  • Operating Systems
  • Speech Recognition
  • Voice Access
  • voice commands
  • Windows Vista


WSR Macros extend Windows Vista’s speech recognition feature

Microsoft has launched the first Technical Preview of Windows Speech Recognition (WSR) Macros, an extension of the speech recognition capabilities in Windows Vista.

Emil Protalinski - Apr 30, 2008 9:46 pm UTC


Every new macro file created by the user will be digitally signed by default, ensuring the file cannot be changed or tampered with. Choosing "New Speech Macro…" brings up a choice of four types of macros.
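Speech macros are defined in small XML files. The sketch below is a hypothetical example (the trigger phrase and inserted text are invented); the speechMacros, command, listenFor, and insertText element names follow the format Microsoft documented for the WSR Macros preview:

```xml
<speechMacros>
  <!-- When the user says the listenFor phrase, WSR inserts the text at the cursor. -->
  <command>
    <listenFor>Insert my signature</listenFor>
    <insertText>Kind regards, Robert</insertText>
  </command>
</speechMacros>
```

A file like this would be saved in the user's Speech Macros folder and, under the default security model described above, signed before it is loaded.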


Microsoft is asking for feedback and comments on WSR Macros to be e-mailed to [email protected], assuming that the problem found is not one of the four known issues in this release:

  • Set Security Level Task Dialog not accessible: The dialog box for setting the security level cannot be controlled using speech in this release.
  • Vague error messages when signing macro files: Currently, when there is an error with the signing or validation of signed macro files, an error code is returned with no additional information. Ensure that you have a signing certificate stored on the machine before trying again. More helpful error messages will be included in a future release.
  • Unsupported Characters can be entered by Wizards: When using the Macros Wizard to insert text, the text you provide is not validated and may cause an error on execution. For example, entering either a '<' or '>' (less-than or greater-than character) will cause an error. This will be fixed in a later version.
  • Restart WSR Macros after changing Security Level: When going from a low to a high security level, there may be some unsigned macro files loaded that could be executed. To be sure only those macro files that are signed are loaded, restart WSR Macros.

Microsoft has not given a release date for the final version. While this release is Vista-only, it is not clear whether the software will be available as an optional download for Windows 7 or whether it will come included. Windows 7 is expected to have improved speech recognition features, along with improvements to other alternative inputs, such as multitouch.

Further reading

  • Microsoft: Windows Speech Recognition Macros (.doc)



How to Set Up Speech Recognition in Windows 10 on Your New Computer

Windows 10 All-in-One For Dummies


Windows depends on you to make settings that customize its behavior on your computer. This is good news for you because the ability to customize Windows gives you a lot of flexibility in how you interact with it.

One way to customize Windows to work with physical challenges is to work with the Speech Recognition feature, which allows you to input data into a document using speech rather than a keyboard or mouse.

If you have dexterity challenges from a condition such as arthritis, you might prefer to speak commands, using a technology called speech recognition, rather than type them. Attach a desktop microphone or headset to your computer, enter "Speech recognition" in Cortana's search field, and then press Enter.

The Welcome to Speech Recognition message (see the following figure) appears; click Next to continue. (Note: If you've used Speech Recognition before, this message won't appear. These steps are for first-time setup.)

In the resulting window (shown in the following figure), select the type of microphone that you're using and then click Next. The next screen tells you how to place and use the microphone for optimum results. Click Next.

In the following window (see the following figure), read the sample sentence aloud. When you're done, click Next. A dialog box appears telling you that your microphone is now set up. Click Next.

During the Speech Recognition setup procedure, you're given the option of printing out commonly used commands. It's a good idea to do this, as speech commands aren't always second nature!

A dialog box confirms that your microphone is set up. Click Next. In the resulting dialog box, choose whether to enable or disable document review, in which Windows examines your documents and email to help it recognize your speech patterns. Click Next.

In the resulting dialog box, choose either manual activation mode, where you can use a mouse, pen, or keyboard to turn the feature on, or voice activation, which is useful if you have difficulty manipulating devices because of arthritis or a hand injury. Click Next.

In the resulting screen, if you want to view and/or print a list of Speech Recognition commands, click the View Reference Sheet button and read or print the reference information, and then click the Close button to close that window. Click Next to proceed.

In the resulting dialog box, either leave the default Run Speech Recognition at Startup check box to automatically turn on Speech Recognition when you start your computer or deselect that setting and turn Speech Recognition on manually each time you need it. Click Next.

The final dialog box informs you that you can now control the computer by voice, and offers you a Start Tutorial button to help you practice voice commands. Click that button and follow the instructions to move through it, or click Skip Tutorial to skip the tutorial and leave the Speech Recognition setup.

When you leave the Speech Recognition setup, the Speech Recognition control panel appears (see the preceding figure). Say, "Start listening" to activate the feature if you used voice activation, or click the Microphone on the Speech Recognition control panel if you chose manual activation. You can now begin using spoken commands to work with your computer.


To stop Speech Recognition, say, "Stop listening" or click the Microphone button on the Speech Recognition control panel. To start the Speech Recognition feature again, click the Microphone button on the Speech Recognition control panel.



October 09, 2023


Operate your PC hands-free with Speech Recognition

If you’re looking for ways to engage with your computer without using your hands, you can operate your Windows 11  PC using your voice with Speech Recognition. Learn about Windows 11’s Speech Recognition features and how to activate them on your device for hands-free access.

What is Windows 11 Speech Recognition?

Speech Recognition is a powerful tool that greatly simplifies the way you use your PC. It allows you to control your computer using only your voice. Speech Recognition software makes it possible to start programs, navigate menus, write text, search the internet, and access different parts of your computer just by talking into your PC’s mic. By getting to know your specific voice, Speech Recognition can improve its capabilities and do the best job possible of following your commands.

How to set up and activate Speech Recognition

To get Speech Recognition up and running on your Windows 11 computer, there are a few steps to follow:

Set up your microphone

Make sure your microphone is set up correctly:

  • Select the Windows logo key followed by the Settings icon.
  • Navigate to Time & language > Speech .
  • Under Microphone , select Get Started .
  • The Speech wizard window will open, where you can ensure that your microphone is working properly.

Set up Speech Recognition

In this step, you will teach your device to recognize your voice:

  • Select Windows logo key + Ctrl + S .
  • The Set Up Speech Recognition wizard window will open.
  • Select Next and follow the instructions.

Turn Speech Recognition on and off

Once Speech Recognition is set up on your computer, make sure it’s activated and learn how to quickly turn it on and off with these steps:

  • Navigate to Accessibility > Speech .
  • Toggle on Windows Speech Recognition .
  • You can now turn Speech Recognition on or off by selecting Windows logo key + Ctrl + S .

That’s it! You’re ready to use these Speech Recognition commands  to operate your device with your voice.

Enjoy the Windows 11 Speech Recognition tool to open windows, search the web, and so much more, all hands-free. For other tips on making the most of Windows 11, head to the Windows Learning Center.



COMMENTS

  1. Exploring Speech Recognition And Synthesis APIs In Windows Vista

    With the Windows Vista speech recognition technology, Microsoft has a goal of providing an end-to-end speech experience that addresses key features that users need in a built-in desktop speech recognition experience. This includes an interactive tutorial that explains how to use speech recognition technology and helps the user train the system ...

  2. Windows Vista Speech Recognition Tutorial

    This is the origin of the sound often misattributed to Windows Vista Beta 1. This sound is used in the welcome sequence of the Speech Recognition tutorial in...

  3. Windows Speech Recognition

    Windows Speech Recognition (WSR) is speech recognition developed by Microsoft for Windows Vista that enables voice commands to control the desktop user interface, dictate text in electronic documents and email, navigate websites, perform keyboard shortcuts, and operate the mouse cursor. It supports custom macros to perform additional or ...

  4. Use Speech Recognition in Vista [How To]

    In Windows Vista, Windows Speech Recognition works in the current language of the OS. That means that in order to use another language for speech recognition, you have to have the appropriate language pack installed. Language packs are available as free downloads through Windows Update for the Ultimate and Enterprise versions of Vista.

  5. How to Use Windows Vista Speech Recognition Software

    In this video, we show you how to configure and use Windows Vista Speech Recognition software. This video and over 70 other Windows Vista hundreds more are a...

  6. Speech Recognition in Windows Vista

    When you're using Windows Vista voice recognition and you tell it to "show numbers," the current window has numbered regions overlaid on its user interface elements, so that they can be easily selected just by saying a number. For example, notice the interface of Windows Live Writer as seen below. Even though the default interface will click when ...

  7. Speech Recognition Tutorial Windows Vista

    How to get Speech Recognition to work! Unedited Version of me introducing Windows Vista's Speech Recognition software.

  8. How to use Windows' built-in speech recognition

    In Windows 10, type "speech" into the search box next to the Start button, and among the results select the Speech Recognition option (not, initially, the Speech Recognition desktop app).

  9. Speech Recognition And Synthesis Managed APIs In Windows Vista

    One of the coolest features to be introduced with Windows Vista is the new built-in speech recognition facility. To be fair, it has been there in previous versions of Windows, but not in the useful form in which it is now available. Best of all, Microsoft provides a managed API with which developers can start digging into this rich technology.

  10. How to use Vista's Speech Recognition Abilities

    Speech Recognition is one of the newest additions and certainly one of the most impressive new features Microsoft has introduced with Vista, but you will need a high-quality microphone to make this possible. ... In order to use Vista's built-in Windows Speech Recognition feature you will need to obtain a good-quality headset microphone (check ...

  11. How to use the Windows Speech Recognition feature

    Press the Windows key, type Control Panel, and press Enter. Or, open the Start menu, and click Windows System > Control Panel. In the Control Panel, click Ease of Access in the lower-right corner of the window. On the next screen, click Speech Recognition. In Windows 11, click the Start speech recognition option instead.

  12. WSRToolkit for Windows Speech Recognition

    We specialize in extension software for Windows™ Speech Recognition (WSR) - the free "speech recognition" system built into Microsoft's™ most recent operating systems (thoroughly tested in Vista through to the new Windows™ 11). Our WSRToolkit v.3 includes easy to create macro shortcuts without programming and accuracy features among others.

  13. Vista Speech Recognition

    Vista Speech Recognition Hi, I have Vista Premium with a Logitech USB microphone. I can record my voice using the Sound Recorder with the microphone and there is movement of the mini-equalizer in the Sound Dialog-Recording Tab under the USB Microphone, so it is functional.

  14. Use voice recognition in Windows

    Help your PC recognize your voice. You can teach Windows 11 to recognize your voice. Here's how to set it up: Press Windows logo key +Ctrl+S. The Set up Speech Recognition wizard window opens with an introduction on the Welcome to Speech Recognition page. Tip: If you've already set up speech recognition, pressing Windows logo key +Ctrl+S opens ...

  15. Quick Tips: Windows Vista Speech recognition

    Talk to your computer and it will do what you say. We'll show you how to set up speech recognition in Vista.

  16. Windows 11 Voice Access to Replace Outdated Vista-Era Speech

    A Look Back at Speech Recognition. Speech Recognition was initially introduced as a stand-alone feature in 2006 with the launch of Windows Vista, with the intention of enhancing the operating ...

  17. WSR Macros extend Windows Vista's speech recognition feature

    Microsoft has released the first Technical Preview (think pre-beta) of Windows Speech Recognition (WSR) Macros, an extension of the Windows Speech Recognition capabilities in Windows Vista. The 2 ...

  18. How to Set Up Speech Recognition in Windows 10 on Your New ...

    Attach a desktop microphone or headset to your computer, enter "Speech recognition" in Cortana's search field, and then press Enter. The Welcome to Speech Recognition message (see the following figure) appears; click Next to continue. (Note: If you've used Speech Recognition before, this message won't appear.

  19. Windows Vista Speech Recognition

    Microsoft demos the new speech recognition in Vista. Unfortunately it doesn't quite go to plan...Of course this was a long time ago, Vista's fine today. As a...

  20. Using Speech Recognition on Your PC

    Set up Speech Recognition. In this step, you will teach your device to recognize your voice: Select Windows logo key + Ctrl + S. The Set Up Speech Recognition wizard window will open. Select Next and follow the instructions. Turn Speech Recognition on and off. Once Speech Recognition is set up on your computer, make sure it's activated and ...

  21. Windows Vista Speech Recognition Sounds

    Sounds from Windows Vista Speech Recognition. Download: https://drive.google.com/drive/folders/17XASyPV4nGrexPvejSHAE_JFPleU60My

  22. Dad Tries Out Windows Vista Speech Recognition (2007)

    Everybody knows that Windows Vista Speech Recognition was terrible, but just how much of a train wreck was it? Well, let's put it to the test! Can Ben write a...